This blog post explains how computers running the Linux kernel receive packets, as well as how to monitor and tune each component of the networking stack as packets flow from the network toward userland programs.
We’ve designed a distributed system for sharing enormous datasets – for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing-fast download speeds.
This article describes the many design decisions that go into creating a fast, general-purpose hash table. It culminates with a benchmark between my own emilib::HashSet and C++11’s std::unordered_set. If you are interested in hash tables and designing one yourself (no matter which language you are programming in), this article might be for you.
ArangoDB is a distributed and highly scalable database for all data models. ArangoDB is fully certified for DC/OS, including persistent primitives. Setting up and maintaining a cluster is extremely easy.
Key Features in a Nutshell
Why should I care?
Data structures might not be the juiciest topic in the world, but they are hugely important to growing as an engineer. Knowing data structures doesn’t just make your programs faster and more efficient; it also helps you organize your code and your thoughts so that you can build more complicated programs without a ton of mental overhead.
Yeah, lots of computer science topics are intimidating, and that’s largely the fault of how they are taught. Here we’re going to do a high-level pass over a lot of the key things you need to know in order to dive deeper into them. It’s more about introducing you to the shared language of data structures.
Awesome! Head on over to the itsy-bitsy-data-structures.js file.
Every day, people rely on a wide variety of sources to stay informed — from news stories to social media posts to search results. Being able to develop Machine Learning models that can automatically deliver accurate summaries of longer text can be useful for digesting such large amounts of information in a compressed form, and is a long-term goal of the Google Brain team.
Summarization can also serve as an interesting reading comprehension test for machines. To summarize well, machine learning models need to be able to comprehend documents and distill the important information, tasks which are highly challenging for computers, especially as the length of the documents increases.
In an effort to push this research forward, we’re open-sourcing TensorFlow model code for the task of generating news headlines on Annotated English Gigaword, a dataset often used in summarization research. In the documentation, we also specify the hyper-parameters that achieve better than the published state of the art on the most commonly used metric as of the time of writing. Below we also provide samples generated by the model.
In an industry where so many people want to change the world, it’s fair to say that low-cost object storage has done just that. Building a business that requires flexible low-latency storage is now affordable in a way we couldn’t imagine before.
When building the Exoscale public cloud offering, we knew that a simple object storage service, protected by Swiss privacy laws, would be crucial. After looking at the existing object storage software projects, we decided to build our own solution: Pithos.
Pithos is an open source S3-compatibility layer for Apache Cassandra, the column database. In other words, it allows you to use standard S3 tools to store objects in your own Cassandra cluster. If this is the first time that you’ve looked at object storage software, then you may wonder why Pithos is built on top of a NoSQL database, but it’s not all that unusual.