Sunday, 28 December 2014

Data Visualization App Using GAE Python, D3.js and Google BigQuery: Part 3

In the previous part of this tutorial, we saw how to get started with D3.js, and created dynamic scales and axes for our visualization graph using a sample dataset. In this part of the tutorial, we'll plot the graph using the sample dataset.
To get started, clone the previous tutorial source code from GitHub.
Navigate to the Google App Engine (GAE) SDK directory and start the server.
Point your browser to http://localhost:8080/displayChart and you should be able to see the X and Y axes that we created in the previous tutorial.
Before getting started, create a new template called displayChart_3.html which will be the same as displayChart.html. Also add a route for displayChart_3.html. This is done just to keep the demo of the previous tutorial intact, since I'll be hosting it on the same URL.
In our sample dataset, each data point is a count that we want to plot against its corresponding year.
We'll represent each of the data points as circles in our visualization graph. D3.js provides API methods to create various shapes and sizes.
First, we'll use d3.selectAll to select the circles inside the visualization element. If none exist yet, it returns an empty selection to which we can add circles later.
Next, we'll bind our dataset to the circles selection.
Since our existing circle selection is empty, we'll use enter to get a placeholder for each unbound data point and append a new circle for it.
Next, we'll define properties such as the coordinates of each circle's center (cx and cy), its color and its radius. To compute cx and cy, we'll use xScale and yScale to transform the year and count data into the plotting space, so that each circle is drawn in the right place in the SVG area. Here is how the code will look:
Save changes and refresh your page. You should see the image below:
Chart with circles plotted
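To make the scale step less of a black box, here is a minimal Python sketch of the arithmetic a linear scale performs when it maps a data value into pixel space. It is not D3 code, and the helper name make_linear_scale, the domains and the pixel ranges are all assumptions chosen for illustration.

```python
def make_linear_scale(domain, output_range):
    """Return a function that maps values from `domain` to `output_range`.

    This mimics the arithmetic of a linear scale; it is an illustration,
    not the D3.js implementation.
    """
    d0, d1 = domain
    r0, r1 = output_range
    if d0 == d1:
        raise ValueError("domain must span more than one value")

    def scale(value):
        # Linear interpolation between the two ends of the output range.
        return r0 + (value - d0) * (r1 - r0) / float(d1 - d0)

    return scale


# Hypothetical domains and pixel ranges, chosen only for this example.
x_scale = make_linear_scale((1590, 1612), (50, 950))
# The y range is inverted because SVG y coordinates grow downward.
y_scale = make_linear_scale((0, 100), (450, 50))

cx = x_scale(1601)  # horizontal position of the circle's center
cy = y_scale(25)    # vertical position of the circle's center
```

With these made-up numbers, the year 1601 sits halfway along the X domain, so cx lands halfway along the pixel range.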
In the first part of this series, when we fetched data from BigQuery, we selected some 1,000 words.
The dataset lists all the words that appear across Shakespeare's works. To make the visualization app reveal something useful, we'll modify our query to select the number of times a particular word, for example Caesar, appears in Shakespeare's works across different years.
So, log into Google BigQuery and we'll have a screen like the one shown below:
Google BigQuery page showing query history
The BigQuery interface lets us compose and test our SQL queries. We want to select the number of times a particular word appears across all of Shakespeare's works.
So our basic query would look like this:
The above query gives us a result set as shown below:
Google BigQuery query and result
Let's also include the work (corpus) that each word count belongs to. Modify the query as shown to include the corpus:
The result set is shown below:
Google BigQuery result set
Next, open app.py and create a new class called GetChartData. Inside it, include the query statement we created above.
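As a rough sketch of what such a query statement could look like as a Python string, the snippet below uses the public Shakespeare sample table and its word, word_count, corpus and corpus_date columns in BigQuery's legacy SQL syntax. The exact query used in this tutorial may differ, and the constant names BASIC_QUERY and CORPUS_QUERY are assumptions.

```python
# Assumed table: BigQuery's public Shakespeare sample, legacy SQL syntax.
SHAKESPEARE_TABLE = "[publicdata:samples.shakespeare]"

# Basic query: how often the word appears, and in which year.
BASIC_QUERY = (
    "SELECT word_count, corpus_date "
    "FROM " + SHAKESPEARE_TABLE + " "
    "WHERE word = 'Caesar'"
)

# Extended query: also return the work (corpus) for each count.
CORPUS_QUERY = (
    "SELECT word_count, corpus_date, corpus "
    "FROM " + SHAKESPEARE_TABLE + " "
    "WHERE word = 'Caesar'"
)
```

Keeping the query in a named constant makes it easy to paste into the BigQuery console for testing before wiring it into the handler.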
Next, create a BigQuery service against which we'll execute our queryData.
Now, execute the queryData against the BigQuery service and print the result to the page.
Also include a new route for GetChartData as shown.
Finally, deploy the updated code to GAE.
Point your browser to http://YourAppspotUrl.com/getChartData which should display the resulting data from BigQuery.
Next, we'll parse the data received from Google BigQuery, convert it into a JSON object and pass it to the client side, where D3.js will process it.
First, we'll check whether dataList contains any rows. If it doesn't, we'll return an empty (null or zero) response.
Next, we'll parse dataList by looping over each row, picking up the count, year and corpus, and building the JSON object we need.
Since we'll be returning the parsed data as JSON, import Python's json library.
And return the created response as a JSON response.
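The parsing steps above can be sketched roughly as follows. The sketch assumes that dataList is the decoded response of a BigQuery query call, in which each row is a dict with an "f" list of cells holding their value under "v", and that the columns arrive in the order count, year, corpus. The function names and the sample values are made up for illustration, and an empty list stands in for the empty response.

```python
import json


def rows_to_chart_data(data_list):
    """Convert a BigQuery-style response dict into a list of plain dicts.

    Assumes the selected columns are, in order: count, year, corpus.
    """
    rows = data_list.get("rows") or []  # no "rows" key means no results
    chart_data = []
    for row in rows:
        cells = row["f"]
        chart_data.append({
            "count": int(cells[0]["v"]),  # BigQuery returns values as strings
            "year": int(cells[1]["v"]),
            "corpus": cells[2]["v"],
        })
    return chart_data


def to_json_response(data_list):
    """Serialize the parsed rows so they can be written to the HTTP response."""
    return json.dumps(rows_to_chart_data(data_list))


# Made-up sample in the assumed response shape, for illustration only.
sample_data_list = {
    "rows": [
        {"f": [{"v": "12"}, {"v": "1599"}, {"v": "work_a"}]},
        {"f": [{"v": "3"}, {"v": "1600"}, {"v": "work_b"}]},
    ]
}
```

Returning an empty list when there are no rows is a simplification: the client can loop over the result without a special case.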
Let's also make the search keyword dynamic, so that it can be passed as a parameter.
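One way to build the query from a keyword is sketched below. Because the keyword comes from the user, the sketch accepts only plain letters before placing it inside the SQL string. That strict check is my own assumption rather than something from the tutorial, as are the build_query name and the query text, which uses the public Shakespeare sample table in legacy SQL syntax.

```python
import re

# Assumed query shape against the public Shakespeare sample table.
QUERY_TEMPLATE = (
    "SELECT word_count, corpus_date, corpus "
    "FROM [publicdata:samples.shakespeare] "
    "WHERE word = '{word}'"
)

# Deliberately strict: letters only, so quotes cannot break out of the string.
KEYWORD_PATTERN = re.compile(r"^[A-Za-z]+$")


def build_query(keyword):
    """Return the query for `keyword`, or raise ValueError if it looks unsafe."""
    word = keyword.strip()
    if not KEYWORD_PATTERN.match(word):
        raise ValueError("keyword must contain letters only")
    return QUERY_TEMPLATE.format(word=word)
```

In a request handler, the keyword would typically be read from a request parameter and passed to build_query, with the ValueError turned into an error response.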
Here is how the class GetChartData finally looks:
Deploy the app to GAE and point your browser to http://YourAppspotUrl.com/getChartData to see the returned JSON response.
Returned JSON response
Next, we'll create an interface to query the Google BigQuery dataset from our app dynamically. Open up Templates/displayChart_3.html and include an input box where we'll input keywords to query the dataset.
Include the jQuery script in the page. On the DOM ready event, we'll attach a handler that queries the Python GetChartData method when the Enter key is pressed.
Create another function DisplayChart on the client side, inside which we'll make an Ajax call to the Python GetChartData method.
Deploy the code to GAE and point your browser to http://YourAppspotUrl.com/displayChart3. Enter a keyword, say Caesar, and press Enter. Check your browser console and you should see the returned JSON response.
Next, let's plot the circles using the returned response. Create another JavaScript function called CreateChart. It is similar to the InitChart function, except that the data is passed in as a parameter. Here is how it looks:
From the InitChart function, remove the circle creation portion since it won't be required now. Here is how InitChart looks:
From now on, when we load the /displayChart3 page, circles won't be displayed. Circles will only appear once the keyword has been searched. So, on the success callback of the DisplayChart Ajax call, pass the response to the CreateChart function.
Deploy the code to GAE and try searching for the keyword Caesar. We now see the result as circles on the graph, but there is one problem: both axes are drawn again on top of the existing ones.
Chart with overlapping axes
To avoid that, we'll check inside the CreateChart function whether the axes already exist.
As you can see, we simply check whether the SVG element already has axes, and create them only if it doesn't. Deploy the code to GAE and search for the keyword again, and you should see something like this:
Final chart with axes correctly displayed
Although everything looks good now, there are still a few issues that we'll address in the next part of this tutorial. We'll also introduce D3.js transitions and a few more features to make our D3.js graph more interactive.
The source code from this tutorial is available on GitHub.
