Blockchain Explained in 7 Python Functions

I think for many people out there, Blockchain is this phenomenon, which is hard to get your head around. I started watching videos and reading articles, but for me personally, it wasn’t until I wrote my own simple Blockchain, that I truly understood what it is and the potential applications for it.

The way I think about blockchain is it is an encrypted database that is public. If you were Amazon and you wanted to use the technology to track your stock levels, would using Blockchain make sense? Probably not, since your customers won’t want to expend their resources verifying your blockchain, since they state on their website, ‘Only 1 left!’, anyway.

I’ll leave you to think about future applications. So without further ado, lets set up our 7 functions!


Text Data Preprocessing: A Walkthrough in Python

In a pair of previous posts, we first discussed a framework for approaching textual data science tasks, and followed that up with a discussion on a general approach to preprocessing text data. This post will serve as a practical walkthrough of a text data preprocessing task using some common Python tools.

Data preprocessing
Preprocessing, in the context of the textual data science framework.Our goal is to go from what we will describe as a chunk of text (not to be confused with text chunking), a lengthy, unprocessed single string, and end up with a list (or several lists) of cleaned tokens that would be useful for further text mining and/or natural language processing tasks.

Build a serverless multi-region, active-active backend solution in an hour

In the previous posts, we explored availability and reliability and the needs and means of building a multi-region, active-active architecture on AWS. In this blog post, I will walk you through the steps needed to build and deploy a serverless multi-region, active-active backend.

This actually blows my mind — since I will be able to explain this in one blog post and you should be able to deploy a fully functional backend in about an hour. In contrast, few years ago it would have required a lot more expertise, work, time and money!

Automatically Archive Items to S3 Using DynamoDB Time to Live (TTL) with AWS Lambda and Amazon Kinesis Firehose

Earlier this year, Amazon DynamoDB released Time to Live (TTL) functionality, which automatically deletes expired items from your tables, at no additional cost. TTL eliminates the complexity and cost of scanning tables and deleting items that you don’t want to retain, saving you money on provisioned throughput and storage. One AWS customer, TUNE, purged 85 terabytes of stale data and reduced their costs by over $200K per year.

Today, DynamoDB made TTL better with the release of a new CloudWatch metric for tracking the number of items deleted by TTL, which is also viewable for no additional charge. This new metric helps you monitor the rate of TTL deletions to validate that TTL is working as expected. For example, you could set a CloudWatch alarm to fire if too many or too few automated deletes occur, which might indicate an issue in how you set expiration time stamps for your items.

In this post, I’ll walk through an example of a serverless application using TTL to  automate a common database management task: moving old data from your database into archival storage automatically. Archiving old data helps reduce costs and meet regulatory requirements governing data retention or deletion policies. I’ll show how TTL—combined with DynamoDB Streams, AWS Lambda, and Amazon Kinesis Firehose—facilitates archiving data to a low-cost storage service like Amazon S3, a data warehouse like Amazon Redshift, or to Amazon Elasticsearch Service.,Amazon_DynamoDB&linkId=50979940

Notes on structured concurrency, or: Go statement considered harmful

Every concurrency API needs a way to run code concurrently. Here’s some examples of what that looks like using different APIs:

go myfunc(); // Golang
pthread_create(&thread_id, NULL, &myfunc); /* C with POSIX threads */
spawn(modulename, myfuncname, []) % Erlang
threading.Thread(target=myfunc).start() # Python with threads
asyncio.create_task(myfunc()) # Python with asyncio

There are lots of variations in the notation and terminology, but the semantics are the same: these all arrange for myfunc to start running concurrently to the rest of the program, and then return immediately so that the parent can do other things.

Another option is to use callbacks:

QObject::connect(&emitter, SIGNAL(event()), &receiver, SLOT(myfunc())) // C++ with Qt
g_signal_connect(emitter, "event", myfunc, NULL) /* C with GObject */
document.getElementById("myid").onclick = myfunc; // Javascript
promise.then(myfunc, errorhandler) // Javascript with Promises
deferred.addCallback(myfunc) # Python with Twisted
future.add_done_callback(myfunc) # Python with asyncio

Again, the notation varies, but these all accomplish the same thing: they arrange that from now on, if and when a certain event occurs, then myfunc will run. Then once they’ve set that up, they immediately return so the caller can do other things. (Sometimes callbacks get dressed up with fancy helpers like promise combinators, or Twisted-style protocols/transports, but the core idea is the same.)

And… that’s it. Take any real-world, general-purpose concurrency API, and you’ll probably find that it falls into one or the other of those buckets (or sometimes both, like asyncio).

New AI Imaging Technique Reconstructs Photos with Realistic Results

Researchers from NVIDIA, led by Guilin Liu, introduced a state-of-the-art deep learning method that can edit images or reconstruct a corrupted image, one that has holes or is missing pixels.

The method can also be used to edit images by removing content and filling in the resulting holes.

The method, which performs a process called “image inpainting”, could be implemented in photo editing software to remove unwanted content, while filling it with a realistic computer-generated alternative.

“Our model can robustly handle holes of any shape, size location, or distance from the image borders. Previous deep learning approaches have focused on rectangular regions located around the center of the image, and often rely on expensive post-processing,” the NVIDIA researchers stated in their research paper. “Further, our model gracefully handles holes of increasing size.”

Indexing Metadata in Amazon Elasticsearch Service Using AWS Lambda and Python

You can use Amazon S3 to implement a data lake architecture as the single source of truth for all your data. Taking this approach not only allows you to reliably store massive amounts of data but also enables you to ingest the data at a very high speed and do further analytics on it. Ease of analytics is important because as the number of objects you store increases, it becomes difficult to find a particular object—one needle in a haystack of billions.

Objects in S3 contain metadata that identifies those objects along with their properties. When the number of objects is large, this metadata can be the magnet that allows you to find what you’re looking for. Although you can’t search this metadata directly, you can employ Amazon Elasticsearch Service to store and search all of your S3 metadata. This blog post gives step-by-step instructions about how to store the metadata in Amazon Elasticsearch Service (Amazon ES) using Python and AWS Lambda.,Amazon_DynamoDB&linkId=50822242