On Incomplete HTTP Reads and the Requests Library In Python

The requests library is arguably the most widely used HTTP library for Python. However, what I believe most of its users are not aware of is that its current stable version happily accepts responses whose length is less than what is given in the Content-Length header. If you are not careful enough to check this yourself, you may end up using corrupted data without even noticing. I have witnessed this first-hand, which is the reason for the present blog post. Let's see why the current requests version does not do this checking (spoiler: it is a feature, not a bug) and how to check this manually in your scripts.

https://blog.petrzemek.net/2018/04/22/on-incomplete-http-reads-and-the-requests-library-in-python/

Integration layer between Requests and Selenium for automation of web actions

Requestium is a Python library that merges the power of Requests, Selenium, and Parsel into a single integrated tool for automating web actions.

The library was created for writing web automation scripts that are written using mostly Requests but that are able to seamlessly switch to Selenium for the JavaScript heavy parts of the website, while maintaining the session.

Requestium adds independent improvements to both Requests and Selenium, and every new feature is lazily evaluated, so it's useful even when writing scripts that use only Requests or Selenium.

Features

  • Enables switching between a Requests’ Session and a Selenium webdriver while maintaining the current web session.
  • Integrates Parsel’s parser into the library, making xpath, css, and regex much cleaner to write.
  • Improves Selenium’s handling of dynamically loading elements.
  • Makes cookie handling more flexible in Selenium.
  • Makes clicking elements in Selenium more reliable.
  • Supports Chrome and PhantomJS.

https://github.com/tryolabs/requestium

Node.js Express API Development Security Checklist

The folks at RisingStack have published a really good article on security in Node.js applications and this checklist is meant to complement it with specifics for API development using the express framework.

  • [ ] Secure headers: use helmet, especially to set the Strict Transport Security header, which will keep all your connections on HTTPS. Also see here on how to set up HTTPS using a free certificate from Let's Encrypt.
  • [ ] Log all errors but don’t expose stacktraces to the client.
  • [ ] Rate limit API calls to protect against DoS attacks. You can use express-rate-limit.
  • Sanitize all user input
    • [ ] SQL injection: use prepared statements instead of concatenating user input. For example, the following endpoint, which builds its query by concatenation,
      app.get('/', function(req, res) {
        Promise.using(getSqlConnection(), function(connection) {
          // Vulnerable: user input is concatenated directly into the SQL string.
          var sql = 'SELECT * from users where id = "' + req.query.username + '"';
          return connection.queryAsync(sql)
            .then(function(rows, cols) {
              return rows;
            });
        });
      });

      can be hijacked to /?username=anything%22%20OR%20%22x%22%3D%22x which results in the following SQL query being executed: select * from users where id = "anything" OR "x"="x". The condition always evaluates to true, so the query returns data for all the users in the system. This can be further extended to cause a lot more damage.

    • [ ] XSS: prevent an attacker from injecting arbitrary code into your application by sanitizing user input. For example, the following endpoint, which accepts user input,
      app.get('/', function(req, res) {
        var html = 'Hello ' + req.query.username;
        res.send(html);
      });

      can then be hijacked to create a URL as follows: /?username=%3Cbody%20onload%3Dalert(%27test1%27)%3E. This link can then be sent to unsuspecting users of your website to have arbitrary code executed in their browsers. See here for more types of XSS attacks and examples.

    • [ ] Command injection: for example, a url like https://example.com/downloads?file=user1.txt could be turned into https://example.com/downloads?file=%3Bcat%20/etc/passwd.
    • [ ] MongoDB query injection: similar to SQL injection but using MongoDB's special operators instead. As an example, consider the following endpoint
      app.post('/', function (req, res) {
        db.users.find({username: req.body.username, password: req.body.password}, function (err, users) {
            // TODO: handle the rest
        });
      });

      where sending in

      POST http://target/ HTTP/1.1
      Content-Type: application/json
      
      {
          "username": "vic@smalldata.tech",
          "password": {"$gt": ""}
      }
      

      will result in a successful match. Use express-mongo-sanitize to sanitize all user input.

    • [ ] Regex Denial of Service: a situation where a user-supplied regex can block the event loop and hang the application. See here for examples.
  • [ ] Use TLS for all connections. Also see here on how to set up HTTPS using a free certificate from Let's Encrypt.
  • [ ] Keep dependencies updated to stay ahead of any security issues. Use nsp to check dependencies for security vulnerabilities. Another great platform for open source projects is snyk.io.
  • [ ] Check for permissions at every step of the API chain: for example, GET /users/:userId/contacts/:contactId should not assume that the userId authenticated for the request is also authorized to make this call. Check that request.params.userId === request.authenticatedUserId or isAuthorized(authenticatedUserId, {userId: authenticatedUserId, resource: 'CONTACTS'}).
  • [ ] Don’t block the event loop: as an example, parsing JSON is not a free operation and can potentially block the event loop for large JSON files (> 1 MB). Note that using the bodyparser module globally will give you a default maximum of 100kb for JSON payloads. It is more efficient to use it only for routes that require it.
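
As a rough illustration of three of the items above (XSS, MongoDB operator injection, and per-request permission checks), here is a minimal sketch. The helper names escapeHtml, isPlainString, and requireSameUser are hypothetical and not part of Express or of the original checklist, and req.authenticatedUserId is assumed to be set by earlier authentication middleware. It is a starting point under those assumptions, not a complete defense.

```javascript
'use strict';

// Hypothetical helper: escape the characters that are significant in HTML
// so that user input is rendered as text rather than parsed as markup.
// '&' is replaced first so the entities added afterwards are not re-escaped.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: accept only plain strings for credentials, so an
// object such as {"$gt": ""} never reaches the MongoDB query.
function isPlainString(value) {
  return typeof value === 'string';
}

// Hypothetical Express-style middleware: reject the request unless the
// authenticated user (assumed to be stored on req.authenticatedUserId)
// is the user named in the URL.
function requireSameUser(req, res, next) {
  if (req.params.userId !== req.authenticatedUserId) {
    res.status(403).send('Forbidden');
    return;
  }
  next();
}

// Example: the XSS payload from the checklist is rendered harmless.
const greeting = 'Hello ' + escapeHtml("<body onload=alert('test1')>");
console.log(greeting);
```

In a route handler, isPlainString would be applied to req.body.username and req.body.password before they are passed to db.users.find, and requireSameUser would be registered on routes such as GET /users/:userId/contacts/:contactId. For SQL, the placeholders of the database driver (prepared statements) serve the same purpose of keeping user input out of the query text.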

Please note that this checklist is meant to be used as a reference for further study. It is by no means an exhaustive list of all potential security issues. See also the web developer security checklist. Additions and comments are welcome.

https://smalldata.tech/blog/2017/05/19/nodejs-express-api-development-security-checklist

Building Business Systems with Domain-Specific Languages for NGINX & OpenResty

This post is adapted from a presentation at nginx.conf 2016 by Yichun Zhang, Founder and CEO of OpenResty, Inc. This is the first of two parts of the adaptation. In this part, Yichun describes OpenResty’s capabilities and goes over web application use cases built atop OpenResty. In Part 2, Yichun looks at what a domain-specific language is in more detail.

You can view the complete presentation on YouTube.

https://www.nginx.com/blog/building-business-systems-with-domain-specific-languages-for-nginx-openresty-part-1/
https://www.nginx.com/blog/building-business-systems-with-domain-specific-languages-for-nginx-openresty-part-2/

How to make complex requests simple with RxJava in Kotlin

It is a common problem in Android development that your API does not send you exactly the data you want to show in your views, so you need to implement more complex requests. Possibly your app needs to make multiple requests that wait for each other, or to start multiple requests after a previous one has finished. Sometimes you even need to combine these two approaches. This can be challenging in plain Java and will often result in unreadable code, which is also painful to test.

Today I’m going to show you in a simple example how this can be achieved in a clean way using RxJava. The example is written in Kotlin, which makes the code more concise and easy to read. If you are completely new to RxJava or Kotlin, I suggest you catch up on the basics. There are some great resources here as well.

https://blog.mindorks.com/how-to-make-complex-requests-simple-with-rxjava-in-kotlin-ccec004c5d10

What you should know to really understand the Node.js Event Loop

Node.js is an event-based platform. This means that everything that happens in Node is the reaction to an event. A transaction passing through Node traverses a cascade of callbacks.
Abstracted away from the developer, this is all handled by a library called libuv which provides a mechanism called an event loop.

This event loop is perhaps the most misunderstood concept of the platform.

I work for Dynatrace, a performance monitoring vendor, and when we approached the topic of event loop monitoring, we put a lot of effort into properly understanding what we are actually measuring.

In this article I will cover our learnings about how the event loop really works and how to monitor it properly.
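
To make "monitoring the event loop" a little more concrete, here is a minimal sketch of one common heuristic: schedule a timer and measure how much later than requested it fires. The function name measureEventLoopLag is hypothetical, and this is not necessarily the metric the article or Dynatrace uses. It only illustrates that a busy event loop delays callbacks.

```javascript
'use strict';

// Hypothetical helper: schedule a timer for intervalMs and report how many
// milliseconds later than requested it actually fired. A large value
// suggests the event loop was busy running other callbacks.
function measureEventLoopLag(intervalMs, callback) {
  const start = process.hrtime.bigint();
  setTimeout(() => {
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    // Timers can fire marginally early, so clamp the lag at zero.
    callback(Math.max(0, elapsedMs - intervalMs));
  }, intervalMs);
}

// Example: take one sample and print it.
measureEventLoopLag(100, (lagMs) => {
  console.log('event loop lag: ' + lagMs.toFixed(2) + ' ms');
});
```

A single sample is noisy; in practice you would sample repeatedly and look at the distribution rather than at individual values.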

https://medium.com/the-node-js-collection/what-you-should-know-to-really-understand-the-node-js-event-loop-and-its-metrics-c4907b19da4c