tiny-dnn is a C++11 implementation of deep learning

  • reasonably fast, without GPU
    • with TBB threading and SSE/AVX vectorization
    • 98.8% accuracy on MNIST in 13 minutes training (@Core i7-3520M)
  • portable & header-only
    • Run anywhere as long as you have a compiler which supports C++11
    • Just include tiny_dnn.h and write your model in C++. There is nothing to install.
  • easy to integrate with real applications
    • no output to stdout/stderr
    • a constant throughput (simple parallelization model, no garbage collection)
    • works without throwing exceptions
    • can import Caffe’s models
  • simply implemented
    • a good library for learning about neural networks

https://github.com/tiny-dnn/tiny-dnn#examples

AWS Storage Gateway provides a file interface to objects in your Amazon S3 buckets

AWS Storage Gateway now provides a virtual on-premises file server, which enables you to store and retrieve Amazon S3 objects through standard file storage protocols. With file gateway, existing applications or devices can use secure and durable cloud storage without needing to be modified. File gateway simplifies moving data into S3 for in-cloud workloads, provides cost-effective storage for backup and archive, or expands your on-premises storage into the cloud.

File gateway is available as a virtual machine image which you download from the AWS Management Console. Once deployed in your data center and associated with your AWS account, your configured S3 buckets will be available as Network File System (NFS) mount points. Your applications read and write files and directories over NFS, interfacing to the gateway as a file server. In turn, the gateway translates these file operations into object requests on your S3 buckets. Like existing volume and tape gateways, your most recently used data is cached on the gateway for low-latency access, and data transfer between your data center and AWS is fully managed and optimized by the gateway. Once in S3, you can access the objects directly or manage them using features such as S3 Lifecycle Policies, object versioning, and cross-region replication.
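As a rough illustration of the translation step described above, here is a minimal sketch of how a file path under an NFS mount point might map to an S3 bucket and object key, and how a file operation might map to an object request. The function names, the mount point, and the bucket name are hypothetical; the real gateway's behavior (caching, metadata, multipart uploads) is not modeled here.

```python
import posixpath

# Assumed mapping from file operations to S3 API operation names.
# The actual gateway may issue different or additional requests.
FILE_OP_TO_S3_OP = {
    "read": "GetObject",
    "write": "PutObject",
    "delete": "DeleteObject",
}


def path_to_object(mount_point, bucket, file_path):
    """Map a file path under the mount point to a (bucket, key) pair."""
    rel = posixpath.relpath(
        posixpath.normpath(file_path), posixpath.normpath(mount_point)
    )
    # Reject the mount point itself and anything outside of it.
    if rel in (".", "..") or rel.startswith("../"):
        raise ValueError(f"{file_path!r} is not a file under {mount_point!r}")
    return bucket, rel


def translate(op, mount_point, bucket, file_path):
    """Describe the object request a file operation would turn into."""
    bucket, key = path_to_object(mount_point, bucket, file_path)
    return {"operation": FILE_OP_TO_S3_OP[op], "bucket": bucket, "key": key}


# Hypothetical mount point and bucket name, for illustration only.
request = translate("write", "/mnt/share", "my-bucket", "/mnt/share/backups/db.dump")
print(request)
```

Under these assumptions, the directory structure below the mount point becomes the object key prefix, which is why objects written through the gateway can later be read directly from S3.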

There are no up-front commitments required and you pay only for what you use.

https://aws.amazon.com/about-aws/whats-new/2016/11/aws-storage-gateway-provides-a-file-interface-to-objects-in-your-amazon-s3-buckets/

Dockerizing MySQL at Uber Engineering

Uber Engineering’s Schemaless storage system powers some of the biggest services at Uber, such as Mezzanine. Schemaless is a scalable and highly available datastore on top of MySQL clusters. Managing these clusters was fairly easy when we had 16 clusters. These days, we have more than 1,000 clusters containing more than 4,000 database servers, and that requires a different class of tooling.

Initially, all our clusters were managed by Puppet, a lot of ad hoc scripts, and manual operations that couldn’t scale at Uber’s pace. When we began looking for a better way to manage our increasing number of MySQL clusters, we had a few basic requirements:

  • Run multiple database processes on each host
  • Automate everything
  • Have a single entry point to manage and monitor all clusters across all data centers

The solution we came up with is a design called Schemadock. We run MySQL in Docker containers, which are managed by goal states that define cluster topologies in configuration files. Cluster topologies specify how MySQL clusters should look; for example, that there should be a Cluster A with 3 databases, and which one should be the master. Agents then apply those topologies on the individual databases. A centralized service maintains and monitors the goal state for each instance and reacts to any deviations.
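The goal-state idea can be sketched as a comparison between a desired topology and the observed one. The sketch below is not Uber's implementation: the function name, the action names, and the database names are all hypothetical, and a topology is reduced to a mapping from database name to role.

```python
def diff_goal_state(goal, actual):
    """Return the actions needed to make `actual` match `goal`.

    Both arguments map a database name to its role ("master" or "replica").
    """
    actions = []
    for name, role in sorted(goal.items()):
        if name not in actual:
            # Database is in the goal state but is not running yet.
            actions.append(("start", name, role))
        elif actual[name] != role:
            # Database is running, but with the wrong role.
            actions.append(("change_role", name, role))
    for name in sorted(actual):
        if name not in goal:
            # Database is running but is not part of the goal state.
            actions.append(("stop", name))
    return actions


# Hypothetical goal state: Cluster A has 3 databases, one of them the master.
goal = {"a-db1": "master", "a-db2": "replica", "a-db3": "replica"}
# Hypothetical observed state that has drifted from the goal.
actual = {"a-db1": "replica", "a-db2": "replica", "a-db4": "replica"}

actions = diff_goal_state(goal, actual)
for action in actions:
    print(action)
# ('change_role', 'a-db1', 'master')
# ('start', 'a-db3', 'replica')
# ('stop', 'a-db4')
```

An agent following this pattern would run the comparison repeatedly and apply the resulting actions, so any deviation from the goal state is corrected on the next pass.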

Schemadock has many components, and Docker is a small but significant one. Switching to a more scalable solution has been a momentous effort, and this article explains how Docker helped us get here.

https://eng.uber.com/dockerizing-mysql/

How do I develop Trading Systems?

Often, trading model developers “spoil” the eventual results of their model by making errors early in the process. These errors include using poorly collected data, not accounting for survivorship bias, and testing too many specifications of a similar model. Data snooping of this kind is particularly costly because it cannot be reversed. Therefore, if you haven’t yet begun the active work of acquiring data, specifying a model, or backtesting, then you have the opportunity to conduct the testing and development process optimally.
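To see why testing too many specifications is dangerous, consider a minimal simulation in which every "strategy" is pure noise with no real edge. This is an illustrative sketch, not a method from the article: the number of specifications, the number of days, and the return distribution are all assumptions.

```python
import random
import statistics


def simulate_returns(rng, n_days):
    """Daily returns of a strategy with zero true edge (mean 0, 1% volatility)."""
    return [rng.gauss(0.0, 0.01) for _ in range(n_days)]


rng = random.Random(42)      # fixed seed so the run is repeatable
n_specs, n_days = 200, 250   # assumed: 200 specifications, about one trading year

# Each specification gets an in-sample period and an independent out-of-sample period.
in_sample = [simulate_returns(rng, n_days) for _ in range(n_specs)]
out_sample = [simulate_returns(rng, n_days) for _ in range(n_specs)]

in_means = [statistics.mean(r) for r in in_sample]
out_means = [statistics.mean(r) for r in out_sample]

# Data snooping: pick the specification that looked best in-sample.
best = max(range(n_specs), key=lambda i: in_means[i])

print(f"best in-sample mean daily return:    {in_means[best]:+.5f}")
print(f"same strategy out-of-sample:         {out_means[best]:+.5f}")
print(f"average in-sample mean (all specs):  {statistics.mean(in_means):+.5f}")
```

The selected strategy looks profitable in-sample only because it was the luckiest of many tries; its out-of-sample result is just another draw from the same zero-mean noise.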

https://www.linkedin.com/pulse/how-do-i-develop-trading-systems-ariel

Binary Data Now Supported by API Gateway

You can now send and receive binary data through API endpoints hosted on Amazon API Gateway. Binary data can either pass through directly or be converted to Base64 encoded text.

You can specify media types (e.g., image/png, application/octet-stream, etc.) to treat as binary through the API Gateway console or APIs. Web or mobile clients can then leverage standard HTTP headers (Content-Type and Accept) to declare what they are sending or expecting to receive. You can also configure API Gateway to handle the API integration request and response bodies in one of three ways: pass them through unchanged, convert them to text (Base64 encoding), or convert them to binary (Base64 decoding).
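The two conversions mentioned above can be sketched with Python's standard library. This is not API Gateway's own code or configuration: the function names and the set of binary media types are hypothetical, chosen only to show what Base64 encoding and decoding of a body look like.

```python
import base64
from typing import Tuple

# Assumed list of media types to treat as binary, as in the examples above.
BINARY_MEDIA_TYPES = {"image/png", "application/octet-stream"}


def to_text_body(body: bytes, content_type: str) -> Tuple[str, bool]:
    """Convert a body to text; binary media types are Base64 encoded."""
    if content_type in BINARY_MEDIA_TYPES:
        return base64.b64encode(body).decode("ascii"), True
    return body.decode("utf-8"), False


def to_binary_body(text: str, is_base64: bool) -> bytes:
    """Convert a text body back to bytes, decoding Base64 if needed."""
    return base64.b64decode(text) if is_base64 else text.encode("utf-8")


# The 8-byte PNG file signature, used here as a small binary payload.
png_signature = b"\x89PNG\r\n\x1a\n"

text, is_base64 = to_text_body(png_signature, "image/png")
print(text)  # iVBORw0KGgo=

# The round trip restores the original bytes.
restored = to_binary_body(text, is_base64)
print(restored == png_signature)  # True
```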

MXNet – Deep Learning Framework of Choice at AWS

Machine learning is playing an increasingly important role in many areas of our businesses and our lives and is being employed in a range of computing tasks where programming explicit algorithms is infeasible.

At Amazon, machine learning has been key to many of our business processes, from recommendations to fraud detection, from inventory levels to book classification to abusive review detection. And there are many more application areas where we use machine learning extensively: search, autonomous drones, robotics in fulfillment centers, text and speech recognition, etc.

Among machine learning algorithms, a class of algorithms called deep learning has come to represent those algorithms that can absorb huge volumes of data and learn elegant and useful patterns within that data: faces inside photos, the meaning of a text, or the intent of a spoken word. A set of programming models has emerged to help developers define and train AI models with deep learning, along with open source frameworks that put deep learning in the hands of mere mortals. Some examples of popular deep learning frameworks that we support on AWS include Caffe, CNTK, MXNet, TensorFlow, Theano, and Torch.

Among all these popular frameworks, we have concluded that MXNet is the most scalable framework. We believe that the AI community would benefit from putting more effort behind MXNet. Today, we are announcing that MXNet will be our deep learning framework of choice. AWS will contribute code and improved documentation as well as invest in the ecosystem around MXNet. We will partner with other organizations to further advance MXNet.

http://www.allthingsdistributed.com/2016/11/mxnet-default-framework-deep-learning-aws.html