An ex-Googler’s guide to dev tools

Many years ago, I did a brief stint at Google. A lot has changed since then, but even that brief exposure to Google’s internal developer tools left a lasting impression on me. In many ways, the dev tools inside Google are the most advanced in the world. Google has been a pioneer not only in scaling their own software systems but in figuring out how to build software effectively at scale. They’ve dealt with issues related to codebase volume, code discoverability, organizational knowledge sharing, and multi-service deployment at a level of sophistication that most other companies have not yet reached. (For reference, see Software Engineering at Google.)

In other ways, however, Google’s internal tools are awfully limited. In particular, nearly all of them are tightly coupled with Google’s unique internal ecosystem. Unfortunately, that means you can’t take them with you when you leave.

The Google diaspora has seeded so many other organizations with amazingly talented people who bring lessons learned from working inside one of the world’s leading technology organizations. But adapting to programming outside of Google can be tough, especially when you’ve come to rely on tools you no longer have at your disposal.

Over the years, I’ve learned from my own experience and the experience of lots of others who have left Google. Many of Sourcegraph’s early customers began with an ex-Googler missing code search after leaving Google. I worked closely with these people to understand the gap they were trying to fill, so that we could build Sourcegraph to meet their needs. Over time, patterns emerged in terms of how ex-Googlers sought to introduce new dev tools into their organizations, inspired by their experience with dev tools at Google. Some were successful and others were not.

I thought it would be helpful to write a guide to dev tools outside of Google for the ex-Googler, written with an eye toward pragmatism and practicality. No doubt many ex-Googlers wish they could simply clone the dev tools ecosystem inside of Google to their new company, but you can’t boil the ocean. Here is my take on where you should start and a general path I think ex-Googlers can take to find the tools that will make them—and their new teams—as productive as possible…

https://about.sourcegraph.com/blog/ex-googler-guide-dev-tools/

Dgraph on AWS: Setting up a horizontally scalable graph database

Dgraph is an open source, distributed graph database, built for production environments, and written entirely in Go. Dgraph is fast, transactional, sharded, and distributed (joins, filters, sorts), consistently replicated with Raft, and provides fault tolerance with synchronous replication and horizontal scalability.

The language used to interact with Dgraph is GraphQL and our variant called GraphQL+-. This gives apps access to the benefits of GraphQL directly from the database.

Dgraph has official clients for Go, Java, Python, JavaScript, and C#, and community-supported clients for Dart, Rust, and Elixir. Dgraph users can also use any of the tools and libraries that work with GraphQL.

To get started right away, download Dgraph and follow the quick-start guide.

Getting started with Dgraph locally on your own computer, where you can quickly model your data in Dgraph and build your app, is easy. When you’re ready to deploy this to a production environment, you’ll want to deploy Dgraph to the cloud. You can horizontally scale Dgraph across multiple machines for high availability and data sharding.

In this article, we’ll show how to set up a resilient, highly available Dgraph cluster on AWS.

https://aws.amazon.com/blogs/opensource/dgraph-on-aws-setting-up-a-horizontally-scalable-graph-database/

Learning Math for Machine Learning

Vincent Chen is a student at Stanford University studying Computer Science. He is also a Research Assistant at the Stanford AI Lab.


It’s not entirely clear what level of mathematics is necessary to get started in machine learning, especially for those who didn’t study math or statistics in school.

In this piece, my goal is to suggest the mathematical background necessary to build products or conduct academic research in machine learning. These suggestions are derived from conversations with machine learning engineers, researchers, and educators, as well as my own experiences in both machine learning research and industry roles.

To frame the math prerequisites, I first propose different mindsets and strategies for approaching your math education outside of traditional classroom settings. Then, I outline the specific backgrounds necessary for different kinds of machine learning work, as these subjects range from high school-level statistics and calculus to the latest developments in probabilistic graphical models (PGMs). By the end of the post, my hope is that you’ll have a sense of the math education you’ll need to be effective in your machine learning work, whatever that may be!

To preface the piece, I acknowledge that learning styles, frameworks, and resources are unique to a learner’s personal needs and goals; your opinions would be appreciated in the discussion on HN!

A Note on Math Anxiety
It turns out that a lot of people — including engineers — are scared of math. To begin, I want to address the myth of “being good at math.”

The truth is, people who are good at math have lots of practice doing math. As a result, they’re comfortable being stuck while doing math. A student’s mindset, as opposed to innate ability, is the primary predictor of one’s ability to learn math (as shown by recent studies).

To be clear, it will take time and effort to achieve this state of comfort, but it’s certainly not something you’re born with. The rest of this post will help you figure out what level of mathematical foundation you need and outline strategies for building it.

https://www.ycombinator.com/library/51-learning-math-for-machine-learning

structslop: static analyzer for efficient struct packing for Go

TL;DR: at Orijtech, Inc., we’ve developed a first-of-its-kind static analyzer, “structslop”, that examines your Go programs and recommends optimal struct field arrangements; it’ll help you reduce the RAM consumed by offending structs, making your programs more efficient! High performance systems require efficiency in every aspect, and our work can help out!

https://medium.com/orijtech-developers/efficient-struct-packing-guided-pass-for-go-92255872ec72

https://github.com/orijtech/structslop

Manual Memory Management in Go using jemalloc

Dgraph Labs has been a user of the Go language since our inception in 2015. Five years and 200K lines of Go code later, we’re happy to report that we are still convinced Go was and remains the right choice. Our excitement for Go has gone beyond building systems, and has led us to even write scripts in Go that would typically be written in Bash or Python. We find that using Go has helped us build a codebase that is clean, readable, maintainable and – most importantly – efficient and concurrent.

However, there’s one area of concern that we have had since the early days: memory management. We have nothing against the Go garbage collector, but while it offers a convenience to developers, it has the same issue that other memory garbage collectors do: it simply cannot compete with the efficiency of manual memory management.

When you manage memory manually, memory usage is lower and more predictable, and bursts of memory allocation do not cause crazy spikes in memory usage. For Dgraph using Go memory, all of those have been a problem. In fact, Dgraph running out of memory is a very common complaint we hear from our users.

Languages like Rust have been gaining ground partly because they allow safe manual memory management. We can completely empathize with that.

In our experience, doing manual memory allocation and chasing potential memory leaks takes less effort than trying to optimize memory usage in a language with garbage collection. Manual memory management is well worth the trouble when building database systems that are capable of virtually unlimited scalability.

Our love of Go and our need to avoid Go GC led us to find novel ways to do manual memory management in Go. Of course, most Go users will never need to do manual memory management, and we would recommend against it unless you need it. And when you do need it, you’ll know.

In this post, I’ll share what we have learned at Dgraph Labs from our exploration of manual memory management, and explain how we manually manage memory in Go.

https://dgraph.io/blog/post/manual-memory-management-golang-jemalloc/