BlockSci – A blockchain analysis platform

BlockSci Documentation

Documentation is available for the Python interface library.

Additionally, a demonstration Notebook is available in the Notebooks folder.

For installation instructions, see below. More detailed documentation is coming soon. Meanwhile, feel free to contact us at blocksci@lists.cs.princeton.edu.

https://github.com/citp/BlockSci

tiny-dnn is a C++11 implementation of deep learning

  • reasonably fast, without GPU
    • with TBB threading and SSE/AVX vectorization
    • 98.8% accuracy on MNIST after 13 minutes of training (on a Core i7-3520M)
  • portable & header-only
    • Run anywhere as long as you have a compiler which supports C++11
    • Just include tiny_dnn.h and write your model in C++. There is nothing to install.
  • easy to integrate with real applications
    • no output to stdout/stderr
    • constant throughput (simple parallelization model, no garbage collection)
    • works without throwing exceptions
    • can import Caffe models
  • simply implemented
    • intended to be a good library for learning about neural networks

https://github.com/tiny-dnn/tiny-dnn#examples

uThreads: Concurrent User Threads in C++ (and C)

uThreads is a concurrency library, implemented in C++, based on cooperative scheduling of user-level threads (fibers). User-level threads are lightweight threads that execute on top of kernel threads to provide concurrency as well as parallelism. Kernel threads are necessary to utilize processors, but they come with the following drawbacks:

  • Each suspend/resume operation involves a kernel context switch
  • Thread preemption causes additional overhead
  • Thread priorities and advanced scheduling cause additional overhead

Cooperative user-level threads, on the other hand, provide lightweight context switches and avoid the additional overhead of preemption and kernel scheduling. Most operating systems support only a 1:1 thread mapping (one user-level thread to one kernel-level thread), where multiple kernel threads execute at the same time to utilize multiple cores and provide parallelism. Linux, for example, supports only 1:1 thread mapping.

There is also N:1 thread mapping, where multiple user-level threads are mapped to a single kernel-level thread, and the kernel thread is not aware of the user-level threads’ existence. For example, Facebook’s folly::fiber, libmill, and libtask use N:1 mapping. With N:1 mapping, if the application blocks at the kernel level, all user-level threads are blocked and the application cannot make progress. One way to address this is to block only at the user level, so that only the individual user-level thread blocks. This works very well for I/O-bound applications. However, if a user thread needs the CPU for a long stretch, it can block the other user threads, and such a task is better executed asynchronously on another core.

To avoid this problem, user threads can be mapped onto multiple kernel-level threads, which gives the third scenario: M:N, or hybrid, mapping. Go and uC++, for example, use M:N mapping.
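As a rough illustration of the N:1 cooperative case, here is a minimal sketch in standard C++. It is not the uThreads API: `CooperativeScheduler`, `Step`, `make_task`, and `demo_trace` are hypothetical names, and each "user thread" is modeled as a step function that returns at its yield point rather than as a real fiber with its own stack.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A "user thread" here is just a step function: each call runs until the
// task's next voluntary yield point and returns true if work remains.
using Step = std::function<bool()>;

// Hypothetical N:1 scheduler: every task shares the calling kernel thread.
class CooperativeScheduler {
public:
    void spawn(Step step) { ready_.push_back(std::move(step)); }

    // Round-robin over the ready queue until every task has finished.
    void run() {
        while (!ready_.empty()) {
            Step step = std::move(ready_.front());
            ready_.pop_front();
            if (step()) {  // the task yielded but is not done: requeue it
                ready_.push_back(std::move(step));
            }
        }
    }

private:
    std::deque<Step> ready_;
};

// Builds a task that appends `name` to `trace` once per step, for `steps` steps.
Step make_task(std::string name, int steps, std::vector<std::string>& trace) {
    assert(steps > 0);
    return [name, steps, &trace]() mutable {
        trace.push_back(name);
        return --steps > 0;
    };
}

// Runs two tasks on one kernel thread and returns the resulting interleaving.
std::vector<std::string> demo_trace() {
    std::vector<std::string> trace;
    CooperativeScheduler sched;
    sched.spawn(make_task("A", 2, trace));
    sched.spawn(make_task("B", 3, trace));
    sched.run();
    return trace;
}
```

In this sketch a step that never returns would starve every other task, which is the CPU-bound problem described above; an M:N design would run several such ready-queue loops on separate kernel threads.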

uThreads supports M:N mapping of uThreads (user-level threads) over kThreads (kernel-level threads) with cooperative scheduling. kThreads can be grouped into Clusters, and uThreads can migrate among Clusters. Figure 1 shows the structure of an application built with uThreads that uses a single ReadyQueue Scheduler. The documentation is available at http://samanbarghi.github.io/uThreads.

Figure 1: uThreads Architecture

Designing a fast Hash Table

This article describes the many design decisions that go into creating a fast, general-purpose hash table. It culminates with a benchmark between my own emilib::HashSet and C++11’s std::unordered_set. If you are interested in hash tables and designing one yourself (no matter which language you are programming in), this article might be for you.

http://www.ilikebigbits.com/blog/2016/8/28/designing-a-fast-hash-table

ArangoDB is designed as a native multi-model database

ArangoDB is a distributed and highly scalable database for all data models. ArangoDB is fully certified for DC/OS, including persistent primitives. Setting up and maintaining a cluster is extremely easy.

Key Features in a Nutshell

Document

  • JOINs
  • Transactions
  • Schemaless
  • JSON Objects
  • Secondary Indexes
  • Compact Storage

Graph

Cluster

ArangoDB NoSQL Multi-Model Database: Graph, Document, Key/Value


https://github.com/joowani/python-arango

mlpack is a scalable machine learning library, written in C++

mlpack is a C++ machine learning library with an emphasis on scalability, speed, and ease of use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and maximum flexibility for expert users. To that end, it provides a set of command-line executables that can be used as black boxes, and a modular C++ API that lets expert users and researchers easily change the internals of the algorithms.

As a result of this approach, mlpack outperforms competing machine learning libraries by large margins; see the BigLearning workshop paper and the benchmarks for details.

mlpack is developed by contributors from around the world. It is released free of charge, under the 3-clause BSD License. (Versions older than 1.0.12 were released under the GNU Lesser General Public License (LGPL), version 3.)

mlpack was originally presented at the BigLearning workshop of NIPS 2011 and later published in the Journal of Machine Learning Research. Please cite mlpack in your work.

mlpack bindings for R are provided by the RcppMLPACK project.

http://mlpack.org/

RaftLib – Simple, easy to use stream parallel computation library for C++

Stream processing is a method for extracting data, task, and pipeline parallelism from an application. Ever wash dishes with more than one person: one washer, one dryer, and perhaps one person to put them away? That’s pipeline parallelism. Raft lets you do that for your application. How about adding two washers, or three? You can do this as well. That’s data parallelism: you’re operating on different (independent) dishes at the same time. Going further, the Raft library enables task parallelism. That is, if you have four people on the dishes, you can add somebody on the side, taking inventory of the dishes as they pass by, at the same time the dishes are being washed. The inventory task is independent of the other two (which is slightly more nuanced, since we have to spot the dishes as they go by). The Raft library uses templates, and a compiled library, to make the otherwise perilous task of creating parallel programs far easier.
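The washer/dryer pipeline can be sketched with nothing but the C++ standard library. This is not RaftLib's API: `Channel` and `wash_and_dry` are hypothetical names chosen for the illustration, and the sketch assumes exactly one producer stage and one consumer stage connected by a blocking queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal blocking queue linking two pipeline stages (illustrative only).
template <typename T>
class Channel {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);  // pushing after close() is a usage error
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Signals that no more items will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until an item is available; returns false once closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Two stages run concurrently: the dryer starts on the first dish while the
// washer is still working on the rest.
std::vector<std::string> wash_and_dry(const std::vector<std::string>& dishes) {
    Channel<std::string> washed;
    std::vector<std::string> put_away;

    std::thread washer([&] {
        for (const auto& dish : dishes) washed.push(dish + ":washed");
        washed.close();
    });
    std::thread dryer([&] {
        std::string dish;
        while (washed.pop(dish)) put_away.push_back(dish + ":dried");
    });

    washer.join();
    dryer.join();
    return put_away;
}
```

With a single washer and a single dryer the queue preserves order. Adding a second washer thread feeding the same channel would correspond to the data-parallel case, at the cost of no longer guaranteeing output order and of needing to close the channel only after all washers finish.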

http://www.raftlib.io/

Webinar Recording: Design Patterns and Modern C++

The recording of our May 24th webinar, Design Patterns and Modern C++, is now available on the JetBrainsTV YouTube channel.

In this webinar, Dmitri Nesteruk shows how the classical Design Patterns can be applied to Modern C++. He covers both their canonical implementations and possible improvements.

The demo project is available on GitHub. If you have any suggestions or improvements, your pull requests are welcome!

http://blog.jetbrains.com/clion/2016/05/webinar-recording-design-patterns-and-modern-cpp/