thoughts…

rants and bookmarks about programming stuff…


A study on scripting language APIs


“Applications written in two programming languages, in order to optimize parts where performance is critical or to obtain extensibility through user-written scripts, are common-place nowadays. There are several ways to obtain this kind of interoperability; ideally, however, a language should provide a foreign language interface (FLI), allowing programmers to send and receive both data and function calls to code written in another language. This work discusses the main issues involving the design of APIs for integration of language environments within C applications. We present the main problems faced in the interaction between code executed in an environment with inherently dynamic characteristics such as a scripting language and C code. We compare approaches employed by five languages when handling communication between the data spaces of C and embedded runtime environments and the consequences of these approaches in memory management, as well as sharing of code between the C application and that from the scripting language. We illustrate the differences of the APIs of those languages and their impact in the resulting code of a C application through a case study. Different scripting languages were embedded as plugins for a library, which on its turn exposes to client applications a generic scripting API. This way, the code of each plugin allows us to observe in a clear and isolated way the procedures adopted by each language for function calls, registration of C functions and conversion of data between the environments.”
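To make "sending and receiving both data and function calls" a bit more concrete, here is a rough sketch using Python's standard ctypes module. It is not taken from the paper, which looks at the opposite direction (embedding interpreters inside C applications). It assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library, and the helper names `c_strlen` and `sort_with_c_qsort` are made up for illustration.

```python
import ctypes

# Assumption: POSIX. CDLL(None) gives access to symbols already loaded
# into the process, which includes the C standard library.
libc = ctypes.CDLL(None)

# Data conversion, Python -> C: declare the C signature of strlen.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Function calls, C -> Python: a C function-pointer type for qsort's comparator.
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None


def c_strlen(data: bytes) -> int:
    """Hypothetical helper: call C strlen on a Python bytes object."""
    return libc.strlen(data)


def _compare_ints(a, b):
    # a and b arrive as pointers to C ints; dereference with [0].
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_with_c_qsort(values):
    """Hypothetical helper: sort Python ints by handing a C array to qsort."""
    array = (ctypes.c_int * len(values))(*values)   # Python list -> C array
    callback = CMPFUNC(_compare_ints)               # keep a reference alive
    libc.qsort(array, len(array), ctypes.sizeof(ctypes.c_int), callback)
    return list(array)                              # C array -> Python list
```

Even in this tiny case the three concerns from the abstract show up: converting data in both directions, registering a function the other side can call, and keeping the callback object alive so memory management does not pull it out from under the C code.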


pathoc: break all the Python webservers!

“In this post, I show how pathoc can be used as a very simple fuzzer, by finding issues in a number of major pure-Python webservers. None of the tested servers failed catastrophically – they all caught the unexpected exception and continued serving requests. None the less, I think it’s reasonable to say that we’ve triggered a bug if a) the server returns a 500 Internal Server Error response or terminates the connection abnormally, and b) we see a traceback in our logs. In fact, by this definition, I found bugs in every pure-Python server I tested…”

http://corte.si/posts/code/pathod/pythonservers/index.html

http://pathod.net/docs/pathoc


Havana Design Summit: Benchmarking Swift

“Depending on your goal, you may want a Realistic Benchmark or a Targeted Benchmark. Both approaches require benchmarking tools that scale to avoid any bottlenecks in the benchmarking code during load generation. Because of Swift’s fantastic horizontal scalability, avoiding bottlenecks in benchmarking code can be very challenging. Benchmarking Swift means generating tens of thousands of concurrent requests and utilizing many benchmarking servers to allow hundreds of gigabits per second of available client throughput. Both approaches to benchmarking also benefit from fine-grained collection of total request latency, time-to-first-byte latency, and Swift transaction IDs for every request. But they do have different goals, and that should inform load generation and results analysis.

Realistic Benchmarking asks, “What happens when the cluster sees a particular client load?” or “How many clients, ops-per-second, or throughput can my cluster really support?” You are more interested in simulating a production workload than you are in isolating a particular action. This kind of benchmarking can benefit from simulating parametric mixed client workloads (proportion of object sizes, operation types, etc.) or replaying a workload based on some kind of capture or “trace” from another cluster.

With Targeted Benchmarking, you want to generate a very specific, controlled load on the cluster to identify problems and test potential improvements. Data collected during a synthetic workload will be less noisy than a more realistic, mixed workload. This is useful for testing the effectiveness of tweaks to networking, node hardware, tuning/configuration, and Swift code…”

http://swiftstack.com/blog/2013/04/18/openstack-summit-benchmarking-swift/
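The "parametric mixed client workload" idea can be sketched in a few lines of Python. This is only an illustration, not code from the post or from any Swift benchmarking tool. The operation mix, the object sizes and the `generate_workload` name are all assumptions picked for the example.

```python
import random

# Hypothetical mix: proportions of operation types and object sizes.
OP_WEIGHTS = {"GET": 0.7, "PUT": 0.2, "DELETE": 0.1}
SIZE_WEIGHTS = {4 * 1024: 0.5, 1024 * 1024: 0.4, 100 * 1024 * 1024: 0.1}


def generate_workload(n, op_weights=OP_WEIGHTS, size_weights=SIZE_WEIGHTS, seed=None):
    """Return n (operation, object_size_bytes) pairs drawn from the given mix.

    A fixed seed makes the plan reproducible, so two runs against different
    cluster configurations see the same sequence of requests.
    """
    rng = random.Random(seed)
    ops = rng.choices(list(op_weights), weights=list(op_weights.values()), k=n)
    sizes = rng.choices(list(size_weights), weights=list(size_weights.values()), k=n)
    return list(zip(ops, sizes))
```

A realistic benchmark would feed a plan like this to many load generators at once; a targeted benchmark would collapse the mix to a single operation and size, for example `generate_workload(1000, {"PUT": 1.0}, {4 * 1024: 1.0})`.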


Getting Started with Python for Data Scientists

“With the R Users DC Meetup broadening its topic base to include other statistical programming tools, it seemed only reasonable to write a meta post highlighting some of the best Python tutorials and resources available for data science and statistics. What you don’t know is often the hardest part of picking up a new skill, so hopefully these resources will help make learning Python a little easier. Prepare yourself for code indentation heaven.

Python is such an incredible language because it can do practically anything, from high performance scientific computing to web frameworks such as Django or Flask. Python is heavily used at Google so the language must be doing something right. And, similar to R, Python has a fantastic community around it and, luckily for you, this community can write. Don’t just take my word for it, watch the following video to fully understand…”

http://datacommunitydc.org/blog/2013/03/getting-started-with-python-for-data-scientists/


Fast PyPy-compatible ordered map in 89 lines of Python

“Below is a Skip List implementation done in Python. The great innovation here is that unlike a dictionary, skip lists keep their items permanently in order, making it possible to efficiently walk the collection backwards and forwards, from start or end, or from an arbitrary key, much more efficiently than would ever be possible with a dict…”

http://pythonsweetness.tumblr.com/post/45227295342/fast-pypy-compatible-ordered-map-in-89-lines-of-python
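For readers who have not met skip lists before, here is a minimal sketch of the idea. It is not the 89-line implementation from the linked post; the class and method names are my own, and it only walks forwards (walking backwards would need an extra back pointer per node).

```python
import random


class _Node:
    __slots__ = ("key", "value", "forward")

    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # one "next" pointer per level


class SkipList:
    """Minimal ordered map; a sketch, not a tuned implementation."""

    MAX_LEVEL = 16
    P = 0.5  # chance of promoting a node one level higher

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_LEVEL)  # sentinel, holds no item
        self._level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self.P:
            level += 1
        return level

    def _predecessors(self, key):
        # For each level, the last node whose key is strictly less than `key`.
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._predecessors(key)
        found = update[0].forward[0]
        if found is not None and found.key == key:
            found.value = value  # overwrite an existing key
            return
        level = self._random_level()
        self._level = max(self._level, level)
        node = _Node(key, value, level)
        for i in range(level):  # splice the node in at every level it occupies
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key, default=None):
        found = self._predecessors(key)[0].forward[0]
        if found is not None and found.key == key:
            return found.value
        return default

    def items(self, start=None):
        """Yield (key, value) pairs in key order, optionally from `start` on."""
        if start is None:
            node = self._head.forward[0]
        else:
            node = self._predecessors(start)[0].forward[0]
        while node is not None:
            yield node.key, node.value
            node = node.forward[0]
```

The part a plain dict cannot do cheaply is `items(start=...)`: it finds the first key at or after `start` by descending the levels, then simply follows the bottom-level links in order.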
