In this talk Ben Darnell, the CTO and Co-Founder of Cockroach Labs, discusses the decision to utilize Go in CockroachDB. Ben shares how CockroachDB optimized its memory usage to mitigate issues related to garbage collection and improved its use of channels to avoid deadlocks. Ben also shares creative techniques to integrate non-Go dependencies into the Go build process.
Garbage collection in Go can cause an application to pause which is a concerning issue, but Go also makes a lot of manual tweaks available that allow contol of what actually ends up on top of the garbage heap. Here are two of the optimizations made by CockroachDB to mitigate garbage collection issues:
By vitrue of these two practices (which you can see examples of in the video) CockroachDB sees in Go’s benchmarking tools that no new allocations are done per iteration. Everything is allocated up front and cached.
A fallacy is the use of invalid or faulty reasoning in an argument.
There are two broad types of logical fallacies: formal and informal.
A formal fallacy describes a flaw in the construction of a deductive argument, while an informal fallacy describes an error in reasoning.
In arguments, few things are more frustrating than when you realize that someone is using bad logic, but you can’t quite identify what the problem is.
This rarely happens with the more well-known logical fallacies. For example, when someone in an argument starts criticizing the other person’s reputation instead of their ideas, most people know that’s an ad hominem attack. Or, when someone compares two things to support their argument, but it doesn’t make sense, that’s a false equivalency. But other fallacies are harder to spot. For example, say you’re arguing about politics with a friend, and they say:
“The far-left is crazy. The far-right is violent. That’s why the right answers lie the middle.”
Sure, it might be true that moderation is the answer. But just because two extremes exist doesn’t mean that the truth necessarily lies between those extremes. Put more starkly: If one person says the sky is blue, but someone else says it’s yellow, that doesn’t mean the sky is green. This is an argument to moderation, or the middle ground fallacy — you hear it a lot from people who are trying to mediate conflicts.
When you find yourself in arguments, it’s valuable to be able to spot and, if necessary, call out logical fallacies like this. It can protect you against bad ideas. Check out a few more examples of logical fallacies that can be tough to spot…
“This isn’t my first rodeo with large datasets. The authentication and product management database that I have designed for the largest UK public Wi-Fi provider had impressive volumes too. We were tracking authentication for millions of devices daily. However, that project had a funding that allowed us to pick any hardware, any supporting services and hire any DBAs to assist with replication/data warehousing/troubleshooting. Furthermore, all analytics queries/reporting were done off logical replicas and there were multiple sysadmins that looked after the supporting infrastructure. Whereas this was a venture of my own, with limited funding and 20x the volume.
This is not to say that if we did have loadsamoney we would have spent it on purchasing top-of-the-line hardware, flashy monitoring systems or DBAs (Okay, maybe having a dedicated DBA would have been nice). Over many years of consulting I have developed a view that the root of all evil lies in the unnecessarily complex data processing pipeline. You don’t need a message queue for ETL and you don’t need an application-layer cache for database queries. More often than not, these are workarounds for the underlying database issues (e.g. latency, poor indexing strategy) that create more issues down the line. In ideal scenario, you want to have all data contained within a single database and all data loading operations abstracted into atomic transactions. My goal was not to repeat these mistakes.
As you have already guessed, our PostgreSQL database became the central piece of the business (aptly called ‘mother’, although my co-founder insists that me calling various infrastructure components ‘mother’, ‘mothership’, ‘motherland’, etc is worrying). We don’t have a standalone message queue service, cache service or replicas for data warehousing. Instead of maintaining the supporting infrastructure, I have dedicated my efforts to eliminating any bottlenecks by minimizing latency, provisioning the most suitable hardware, and carefully planning the database schema. What we have is an easy to scale infrastructure with a single database and many data processing agents. I love the simplicity of it — if something breaks, we can pin point and fix the issue within minutes. However, a lot of mistakes were done along the way — this articles summarizes some of them…”
Breadth First Search is one of those fundamental graphs algorithms and also is foundation to many other problems. The basic idea is to move through a graph starting from a source node and visit everything one layer at a time.
They are popular and they are misunderstood. Containers have become the default way applications are packaged and run on servers, initially popularized by Docker. Now, Docker itself is misunderstood. It is the name of a company and a command (a suite of commands, rather) that allow you to manage containers (create, run, delete, network) easily. Containers themselves however, are created from a set of operating system primitives. In this article, we shall concern ourselves with containers on the Linux operating system and simply act as though containers on Windows do not exist at all.
There is no single system call under Linux that creates containers. They are a loose construct made by utilizing Linux namespaces and control groups or cgroups…