From the Console to Consumers

Index

Links to later posts in this series:

  1. Working with Financial Data
  2. Calling the PyData Stack from a Backend
  3. Development, Staging, and Production
  4. Angular 2 for Data Science
  5. Interactive Plotting in JavaScript

Motivation

A lot of great data science happens in interactive IPython sessions or Jupyter notebooks, which allow for quick analysis and communication of results. However, turning notebooks or Python scripts into web applications can be a daunting task, and useful tools often end up buried as a result. This series of blog posts addresses the gap between homespun data analysis tools and consumer-facing web applications. When a data scientist identifies an analysis that could serve consumers directly, this workflow offers a path for bringing it to them.

What Are We Building?

We turn to the world of finance for this project because it contains many examples of consumer-facing services that are backed primarily by data analysis. We will take a couple of Python scripts that optimize a user’s stock portfolio and turn them into an interactive web application. Each post in this series will stand alone conceptually, so feel free to skip around. However, be sure to git checkout the correct branch on GitHub before following along.

Project Architecture

In short, we want to set up a sleek frontend web application that can make API calls to the PyData stack, which will crunch data for us and spit back a response. The finished application will eventually be hosted at stocks.coshx.com, and you can follow the development on the project’s GitHub repository. This application takes a list of stocks, start and end dates, and an initial investment amount as inputs and returns the optimal allocation of each stock and a plot of the optimal portfolio’s performance.
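To make those inputs concrete, here is a minimal sketch of how a request to the backend might be parsed and validated. The class name, field names, and JSON layout are assumptions made for illustration; the series has not defined the real API yet.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class OptimizationRequest:
    """Hypothetical request payload; every field name here is an assumption."""
    symbols: List[str]
    start_date: date
    end_date: date
    initial_investment: float

    @classmethod
    def from_json(cls, raw: str) -> "OptimizationRequest":
        # Parse the JSON body and coerce each field to its expected type.
        data = json.loads(raw)
        request = cls(
            symbols=[str(s).upper() for s in data["symbols"]],
            start_date=date.fromisoformat(data["start_date"]),
            end_date=date.fromisoformat(data["end_date"]),
            initial_investment=float(data["initial_investment"]),
        )
        request.validate()
        return request

    def validate(self) -> None:
        # Reject requests the optimizer could not meaningfully handle.
        if not self.symbols:
            raise ValueError("at least one stock symbol is required")
        if self.start_date >= self.end_date:
            raise ValueError("start_date must be before end_date")
        if self.initial_investment <= 0:
            raise ValueError("initial_investment must be positive")
```

Validating at the edge like this keeps malformed input away from the numerical code, which tends to fail in less readable ways.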

  • The Backend: Under the hood, our application will pull data in from Quandl, calculate statistics on the data using numpy, and optimize stock allocations using scipy.optimize.
  • The Client-Server Architecture: Nginx will serve up the frontend JavaScript and forward API calls to our backend. We will use Tornado on the backend to call Python scripts and respond to frontend calls.
  • The Frontend: The user-facing side of our application will make calls to the backend and serve up graphics to the user via the Angular 2 web framework. We will visualize data using JavaScript libraries such as d3.js.
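As a rough preview of the backend step, the sketch below computes return statistics with numpy and then searches for allocations with scipy.optimize. It rests on several assumptions that are not taken from the project: the objective is the annualized Sharpe ratio, the weights are long-only and sum to one, and simulated prices stand in for data from Quandl. The function name optimize_allocations is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def optimize_allocations(prices: np.ndarray) -> np.ndarray:
    """Return long-only weights (summing to 1) that maximize the Sharpe ratio.

    prices: array of shape (n_days, n_stocks) holding daily closing prices.
    The choice of objective is an assumption made for this sketch.
    """
    # Daily simple returns, then their mean vector and covariance matrix.
    daily_returns = prices[1:] / prices[:-1] - 1.0
    mean_returns = daily_returns.mean(axis=0)
    covariance = np.cov(daily_returns, rowvar=False)
    n_stocks = prices.shape[1]

    def negative_sharpe(weights: np.ndarray) -> float:
        # Annualized Sharpe ratio with a zero risk-free rate, negated
        # because scipy minimizes.
        portfolio_return = weights @ mean_returns
        portfolio_risk = np.sqrt(weights @ covariance @ weights)
        return -np.sqrt(252) * portfolio_return / portfolio_risk

    result = minimize(
        negative_sharpe,
        x0=np.full(n_stocks, 1.0 / n_stocks),  # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_stocks,  # no short selling
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


# Simulated prices for three stocks over 250 trading days (not real data).
rng = np.random.default_rng(0)
simulated_returns = rng.normal(loc=0.0005, scale=0.01, size=(250, 3))
prices = 100.0 * np.cumprod(1.0 + simulated_returns, axis=0)
weights = optimize_allocations(prices)
```

Multiplying the returned weights by the initial investment gives the dollar amount to allocate to each stock.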

Upcoming Posts

Each Friday for the next four weeks, I will roll out a new post in the series. As each post is finished, its link will go live on this page.

  1. Working with Financial Data: This post introduces the common difficulties of analyzing financial data and gives an overview of our approach to portfolio optimization.
  2. Calling the PyData Stack from a Backend: Here we describe how to set up a backend server to handle web requests asynchronously and call our Python scripts using Tornado.
  3. Development, Staging, and Production: Here we transition from backend development to setting up a staging server before continuing with frontend development.
  4. Angular 2 for Data Science: We will introduce a general approach to frontend web development and then dive in with the specifics of Angular 2 with TypeScript.
  5. Interactive Plotting in JavaScript: Yes, there are easier ways to make plots (and we will discuss them), but we will use d3.js to visualize our data.

About Me

I am a lab scientist turned data scientist who is interested in machine learning and software development. After writing endless scripts that I couldn’t share easily with others, I wanted to create a streamlined workflow for developing consumer-facing software that hinges on data analysis.