MyData, YourData, PyData

Author: Piramol Krishnan, Data Science Intern at Fabriq. July 24, 2019

Data has a better idea

‘Is there coffee? Have you seen any coffee? Do you have coffee? WHERE IS THE COFFEE?’

These questions blaze through my mind as I enter the Tower Hotel’s suite and line up to collect my badge and T-shirt. A sea of people in typical programmer apparel (T-shirts, hoodies and jeans) waits to enter the conference rooms where the first tutorials of the day will begin.

It is a particularly lovely, sunny day to spend inside, staring at computer screens. Of course, only PyData could make us do this. PyData London is an annual event for beginners, experts and industry professionals alike, where we convene to listen and learn about the latest developments on the data science frontier. Talks are given by everyone from academics to senior data scientists and range from novice to advanced level. Python is one of the most popular programming languages and is increasingly used for data science. Because it is open source, it is continually being improved, and there are always people developing dedicated libraries that extend what Python can do.

The first day of PyData is Tutorials Day, where lectures are run as live demonstrations so that everyone can follow along and code with the lecturers. The room is particularly crowded, so I find a space on the floor at the back. The lack of coffee becomes more unbearable the longer the tutorial goes on, so I sneak out to find some source of caffeine. I find it, along with some new friends. We venture forth into the next tutorial, hot drinks and muffins in hand.

This one is about CatBoost, a library that implements gradient boosting (an important ensemble technique in machine learning) with its own optimisations. The speaker is sharp and quick, and the tutorial is incredibly invigorating; I relish the opportunity to learn something entirely new. She invites us to join the Slack for the library, but adds the disclaimer that everything there will be in Russian. We break for lunch and then have another two tutorials. The first day ends on a high note as I leave to enjoy the cool breeze by the river.

The next two days are markedly more chaotic and crowded. Gone are the tables we had for our laptops. In their place are dozens more chairs, all squished together to maximise the number of programmers per square metre. There is a whole gamut of talks to choose from. In each time slot, three talks run simultaneously, so you can choose whichever piques your interest. I generally stick to the data science ones but mix it up with some other lectures completely outside my domain of expertise.

It is heartening to engage with programmers from all sorts of backgrounds. Finance, astrophysics, academia: all these paths converge on this one conference, where we use the same libraries and software and share an appreciation for hoodies as officewear. One of my favourite talks was about Vaex, a library for visualisation and exploration of large datasets. The main speaker, the founder of the company that developed the library, was affable, and one of his slides read simply:

“Don’t do a live demo” – Many people

He then proceeded to do a live demo. Thankfully, it went smoothly. During the Q&A, he was asked whether the library would remain open source, given that clients might want specific functionality added that would only be useful to them. He responded that he believed it would always remain open source and that, in general, that is how things should be. Ideology aside, it is also more economical: closed-source software requires maintenance by a dedicated team, whereas open-source software can be maintained by the community that uses it.

“Who would maintain it? Me? Then they would have to pay me again, which is stupid.”

The Python community is incredibly vibrant, dynamic and friendly. At certain points during the talks I found myself a little lost in the technicalities of what was being presented, but I could always find someone who was more than happy to explain it to me. I can only hope that one day I will be up there delivering a talk with as much grace and humour, and possibly with some memes embedded in my presentation as well.