“Recently, I wrote about how we do classification at CB Insights. That post outlines some of my thinking on how to apply machine learning to a given problem, along with the process we adopted for the classification problem at CB Insights. It also gave me a good opportunity to reflect further on the machine learning process and on the shortcomings of papers, books and even the traditional education system when it comes to teaching machine learning.
My aim is not to focus on the algorithms, methods or classifiers but rather to offer a broader picture of how to approach a machine learning problem, and in the meantime give a couple of pieces of bad advice. I will offer my bad advice for a classification problem (algorithm = classifier), and be warned that it may generalize better than your favorite classifier. (I will try not to overfit, but let me know in the comments if I do.)…”
“The first time I heard of the Paxos algorithm was during my bachelor’s degree way back in 2004, when I took a Distributed Algorithms course. In the past few years Paxos has come up multiple times, usually in the context of a robust implementation of some scalable storage system. It is almost always uttered in awe, as if to say “that unbelievably complex algorithm, the Paxos. Beware!”. I’ve decided to reread the original paper, and try to explain what Paxos does, and how it does it. In a nutshell, Paxos solves the problem of resiliently replicating an ever-growing (ordered) list of items…”
“The previous post gave a general overview of the Paxos algorithm. Here’s a quick recap: Paxos implements a resilient distributed log, such that items can be added and each item is assigned a unique (and increasing) index. The algorithm can be split into three main blocks: a leader election, a consensus on a single item (also called the Synod algorithm) and managing the entire log. In this post I want to go into more depth about the Synod algorithm…”
“If you have ever written some TCP socket code, you may have wondered: “What will happen to my connection if the network cable is unplugged or the remote machine crashes?”.
The short answer is: nothing. The remote end of the connection won’t be able to send a FIN packet, and the local OS will not detect that the connection is lost. So it’s up to you as the developer to address this scenario…”
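One common way to address the scenario described above is TCP keepalive, where the OS probes an idle connection and reports an error once the peer stops answering. The excerpt is cut off before the author's own solution, so this is only a minimal sketch of that general technique and not necessarily the approach the linked article takes. The helper name `enable_keepalive` and the timing values are assumptions chosen for illustration, and the `TCP_KEEP*` options are platform-dependent, so the sketch sets them only where Python exposes them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive for an existing socket (hypothetical helper).

    idle:     seconds of inactivity before the first probe is sent
    interval: seconds between unanswered probes
    probes:   unanswered probes before the OS declares the connection dead
    """
    # SO_KEEPALIVE is the portable on/off switch.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The fine-grained timers are not available on every platform,
    # so each one is applied only if the socket module defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With these settings, a read or write on a connection whose peer has vanished eventually fails with an error instead of blocking indefinitely. An application-level heartbeat is an alternative when detection has to be faster or must also work through proxies.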
“People seem to think that writing a garbage collector is really hard, a deep magic understood by a few great sages and Hans Boehm (et al). Well it’s not. In fact, it’s rather straightforward. I claim that the hardest part in writing a GC is writing the memory allocator, which is as hard to write as it is to look up the malloc example in K&R.
A few important things to note before we begin. First, our code will be dependent on the Linux kernel. Not GNU/Linux, but the Linux kernel. Secondly, our code will be 32-bit and not one bit more. Thirdly, please don’t use this code. I did not intend for it to be wholly correct and there may be subtle bugs I did not catch. Regardless, the ideas themselves are still correct. Now, let’s get started…”
“By Avril Coghlan, Parasite Genomics Group, Wellcome Trust Sanger Institute, Cambridge, U.K. Email: firstname.lastname@example.org
This is a simple introduction to time series analysis using the R statistics software.
There is a pdf version of this booklet available at https://media.readthedocs.org/pdf/a-little-book-of-r-for-time-series/latest/a-little-book-of-r-for-time-series.pdf.
If you like this booklet, you may also like to check out my booklet on using R for biomedical statistics, http://a-little-book-of-r-for-biomedical-statistics.readthedocs.org/, and my booklet on using R for multivariate analysis, http://little-book-of-r-for-multivariate-analysis.readthedocs.org/…”
“In practical programming projects, Lisps are rare, and Racket especially so. Thus, before I embarked on my Lisp adventure, I wanted to understand the costs & benefits of using a Lisp. Why do Lisps have such a great reputation, but so few users? Was I seeing something everyone else missed? Or did they know something I didn’t? To find out, I read whatever I could find about Lisps, including Paul Graham’s Hackers & Painters and Peter Seibel’s Practical Common Lisp…”
“nope.c is a C language-based software platform for scalable server-side and networking applications. Think node.js for C programmers…”
“Emacs, thanks to major and minor modes, can be transformed into a powerful specific development environment.
Depending on the kind of files you are editing (or a command you type), it can change and give you specific features that fit the needs of a programming language (or a task like debugging, launching unit tests, …”
“REST is a vast improvement over complex things like SOAP and CORBA, but I think we still have a way to go before we’ve reached simple. REST is an acronym for REpresentational State Transfer, and I think the “state” part of that acronym gives rise to a lot of incidental complexity as systems grow.
You can think of state as a combination of value and time, and in the RESTful case, the time dimension is almost always “now”. The trouble then comes from the absence of a coordinated notion of time.
Almost every program I write today depends on at least one RESTful service, and my program is just one component in an ensemble. As we develop systems that call systems that call systems, what are the odds that everybody participating in the ensemble has the same notion of time? What happens when systems have different notions of time? When we are talking about services in the small, it’s not much of a concern, but for networked systems in the large, conflating value and time into state makes our systems increasingly difficult to reason about…”
“Dyad.c is an asynchronous networking library which aims to be lightweight, portable and easy to use. It can be used both to create small standalone servers and to provide network support to existing projects…”