The tech behind our time series graphs – 2bn docs per day, 30TB per month

“Server Density processes over 30TB/month of incoming data points from the servers and web checks we monitor for our customers, ranging from simple Linux system load average to website response times from 18 different countries. All of this data goes into MongoDB in real time and is pulled out when customers need to view graphs, update dashboards and generate reports…”

https://blog.serverdensity.com/tech-behind-time-series-graphs-2bn-docs-per-day-30tb-per-month/
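The post is mostly about scale, but the read/write pattern it describes is simple: append data points as they arrive, then pull a time range back out to draw a graph. A minimal pymongo sketch of that pattern (collection, field, and host names are illustrative, not Server Density's actual schema):

    from datetime import datetime, timedelta
    from pymongo import ASCENDING, MongoClient

    client = MongoClient()                 # assumes a local mongod
    points = client.metrics.datapoints     # hypothetical db/collection names
    points.create_index([("host", ASCENDING), ("ts", ASCENDING)])

    # Write path: one document per incoming data point, inserted in real time.
    points.insert_one({"host": "web-1", "metric": "loadavg",
                       "ts": datetime.utcnow(), "value": 0.42})

    # Read path: fetch the last hour of points for one host to plot a graph.
    since = datetime.utcnow() - timedelta(hours=1)
    for p in points.find({"host": "web-1", "ts": {"$gte": since}}).sort("ts", ASCENDING):
        print(p["ts"], p["value"])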

Introducing the MongoDB Driver for the Rust Programming Language

“The driver is Apache licensed, and is implemented purely in Rust without relying on any C extensions or libraries (with the exception of an embedded copy of md5, which can be removed when Rust offers a native implementation).

Using the driver will feel familiar to users of current MongoDB drivers. Basic BSON types, such as strings, numbers, and objects, have a built-in internal representation based around Rust’s algebraic data types. Users can also implement a trait (analogous to a Haskell typeclass, similar to a Java interface) which allows them to treat types native to their codebase as though they were native to BSON as well.

Once a user’s data is formatted, interacting with the database is done through familiar objects like Collection, DB, Cursor, and Client. These objects have similar APIs to their counterparts in other languages, and presently offer CRUD and indexing, as well as administrative functionality, in a framework that is Rustically abstracted and balances the philosophy of Rust’s static guarantees with the flexibility of MongoDB. A small example is offered below…”

http://blog.mongodb.org/post/56426792420/introducing-the-mongodb-driver-for-the-rust-programming
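The example the excerpt refers to is cut off above. As a point of reference for the “familiar” feel the post claims, this is what the same Client/DB/Collection/Cursor flow looks like in pymongo; the Rust driver's actual syntax differs, so treat this only as the cross-language pattern being mirrored:

    from pymongo import MongoClient

    client = MongoClient()        # Client
    db = client.test              # DB
    coll = db.people              # Collection

    coll.insert_one({"name": "Ada", "age": 36})                # create
    print(coll.find_one({"name": "Ada"}))                      # read
    coll.update_one({"name": "Ada"}, {"$set": {"age": 37}})    # update
    coll.create_index("name")                                  # indexing
    for doc in coll.find({"age": {"$gte": 30}}):               # Cursor
        print(doc)
    coll.delete_one({"name": "Ada"})                           # delete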

Harnessing the power of Twitter and MongoDB

“With open-source technologies proliferating as “Big Data” and analytics explode, we thought it would be beneficial to let our users and friends utilize a script that takes care of the nitty gritty and allows them to explore what makes MongoDB great. We’re excited to present Twitter-Harvest, a Python script that utilizes the Twitter REST API v1.1 to retrieve tweets from a user’s timeline and inserts them into a MongoDB database…”

http://blog.mongolab.com/2013/08/harnessing-the-power-of-twitter-and-mongodb/
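Not the Twitter-Harvest script itself, but a minimal sketch of the same idea, assuming Tweepy for the OAuth handshake that API v1.1 requires and pymongo for storage (credentials, screen name, and database names are placeholders):

    import tweepy
    from pymongo import MongoClient

    # Placeholder credentials; Twitter REST API v1.1 rejects unauthenticated calls.
    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth)

    tweets = MongoClient().twitter.tweets    # hypothetical db/collection names
    for status in api.user_timeline(screen_name="mongodb", count=200):
        # The raw API payload is already JSON, so it maps directly to a BSON doc.
        tweets.insert_one(status._json)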

Jepsen: Testing the Partition Tolerance of PostgreSQL, Redis, MongoDB and Riak

“Distributed systems are characterized by exchanging state over high-latency or unreliable links. The system must be robust to both node and network failure if it is to operate reliably–however, not all systems satisfy the safety invariants we’d like. In this article, we’ll explore some of the design considerations of distributed databases, and how they respond to network partitions.

IP networks may arbitrarily drop, delay, reorder, or duplicate messages sent between nodes–so many distributed systems use TCP to prevent reordered and duplicate messages. However, TCP/IP is still fundamentally asynchronous: the network may arbitrarily delay messages, and connections may drop at any time. Moreover, failure detection is unreliable: it may be impossible to determine whether a node has died, the network connection has dropped, or things are just slower than expected…”

http://www.infoq.com/articles/jepsen
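The failure-detection point is easy to see from a client's perspective: all it ever observes is the absence of a reply, which looks the same whether the node crashed, the link dropped, or the peer is merely slow. A small illustration:

    import socket

    def probe(host, port, timeout=1.0):
        """Try to reach a node; a timeout is inherently ambiguous."""
        try:
            socket.create_connection((host, port), timeout=timeout).close()
            return "reachable"
        except socket.timeout:
            # Dead node, partitioned network, or just slow? Indistinguishable here.
            return "no reply within %.1fs" % timeout
        except OSError as exc:
            return "connection failed: %s" % exc

    print(probe("10.0.0.1", 5432))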

Why MongoDB is a bad choice for storing our scraped data

“MongoDB was used early on at Scrapinghub to store scraped data because it’s convenient. Scraped data is represented as (possibly nested) records which can be serialized to JSON. The schema is not known ahead of time and may change from one job to the next. We need to support browsing, querying and downloading the stored data. This was very easy to implement using MongoDB (easier than the alternatives available a few years ago) and it worked well for some time.

Usage has grown from a simple store for scraped data used on a few projects to the back end of our Scrapy Cloud platform. Now we are experiencing limitations with our current architecture and, rather than continue to work with MongoDB, we have decided to move to a different technology (more in a later blog post). Many customers are surprised to hear that we are moving away from MongoDB; I hope this blog post helps explain why it didn’t work for us…”

http://blog.scrapinghub.com/2013/05/13/mongo-bad-for-scraped-data/
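The convenience the post starts from is that a scraped item is just a nested record: it can be inserted as-is, with no schema declared up front, and queried on whatever fields it happens to have. A sketch with pymongo (names illustrative):

    from pymongo import MongoClient

    items = MongoClient().scraping.items    # hypothetical db/collection names

    # Items from different jobs can have entirely different shapes.
    items.insert_one({"url": "http://example.com/p/1", "title": "Widget",
                      "price": {"amount": 9.99, "currency": "USD"}})
    items.insert_one({"url": "http://example.com/p/2", "title": "Gadget",
                      "specs": {"weight_kg": 1.2, "colors": ["red", "blue"]}})

    # Browsing and querying work against whatever fields exist.
    for item in items.find({"price.amount": {"$lt": 20}}):
        print(item["title"])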

Playing around with Lua/Nginx, Python, MongoDB, Tornado and jQuery…

stock-labs is a “stock visualization tool” built using different backends and technologies (for training and study purposes only):
1. Lua (OpenResty) + Highstock with jQuery and MySQL (main focus on Lua/Nginx integration)
2. Tornado + Highstock with jQuery and MongoDB (main focus on Tornado (async) and the MongoDB async and sync drivers)

https://github.com/irr/stock-labs
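For the second variant, an async MongoDB driver keeps Tornado's event loop free while a query is in flight. The repo doesn't name its drivers; assuming Motor for the async side (Tornado runs on the asyncio loop in current releases), a minimal sketch:

    import asyncio
    from motor.motor_asyncio import AsyncIOMotorClient

    async def latest_quote(symbol):
        quotes = AsyncIOMotorClient().stocks.quotes  # hypothetical names
        # Awaiting the query yields control back to the event loop,
        # so other requests are served while MongoDB does the work.
        return await quotes.find_one({"symbol": symbol}, sort=[("ts", -1)])

    print(asyncio.run(latest_quote("GOOG")))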

MongoDB: How to limit results and how to page through results

“In this post we are going to take a look at how to limit results in MongoDB as well as how to page through results. MongoDB uses limit to limit the number of results returned, and it uses skip to skip a number of records in the result set. Using limit in conjunction with skip enables you to do paging in MongoDB…”

http://blogs.lessthandot.com/index.php/DataMgmt/DBAdmin/MSSQLServerAdmin/mongodb-how-to-limit-results
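A minimal sketch of the skip/limit combination the post describes, using pymongo (database and collection names are made up):

    from pymongo import MongoClient

    coll = MongoClient().test.articles    # hypothetical db/collection names

    page, page_size = 3, 10               # 1-based page number

    # skip() jumps over earlier matches, limit() caps the batch; sorting first
    # keeps page boundaries stable between requests.
    cursor = (coll.find()
                  .sort("_id", 1)
                  .skip((page - 1) * page_size)
                  .limit(page_size))
    for doc in cursor:
        print(doc)

One caveat worth knowing: skip still walks past the skipped documents on the server, so for deep pages a range query on an indexed field (e.g. _id greater than the last value seen) scales better.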