Four colleagues from AG Computing had a fantastic time at PyCon DE & PyData Berlin 2023. Over the three days of the conference, a diverse set of topics related to Python and data science was addressed. Here we want to share our experiences and highlights. We could not attend every talk, so we look forward to learning more from the conference recordings. As recordings become available, we will update this article accordingly.
After the opening, my first ever day at a PyCon/PyData conference started with the great tutorial “Accelerate Python with Julia” by Stephan Sahm. Its practical examples demonstrated how to achieve a significant speed-up for your code without much extra work. One highlight of the second day was the talk “Visualizing your computer vision data is not a luxury, it’s a necessity: without it, your models are blind and so do you.” by Chazareix Arnault. During his talk, he showed fascinating examples of complicated situations in image data sets (such as weird realities). He also suggested tools for visualizing image data, such as OpenCV, Jupyter Widgets, FiftyOne, and Streamlit. The conference concluded with two excellent talks: “Accelerating Python Code” by Jens Nie and “The Beauty of Zarr” by Sanket Verma. Finally, an inspiring talk was given by Jens Agerberg, who took us into the field of topological data analysis with “Teaching Neural Networks a Sense of Geometry”.
PyCon DE & PyData Berlin 2023 was a great experience for us as data scientists from the GWDG. We had the opportunity to attend a variety of workshops and talks by some of the most influential figures in the world of data science. As a result, we learned about the latest trends and technologies in the field and gained valuable insights from experienced professionals. We were also able to network with other data scientists and build relationships with potential users.
The event was well-organized, and the atmosphere was buzzing with enthusiasm. We also got hands-on experience with the latest tools and technologies, which was extremely useful. Developers and users of various Python packages presented their experiences with state-of-the-art Big Data and Data Science solutions.
Most exciting takeaways for me:
- A detailed overview of management tools for Python packages and environments, which is vital for some current projects at AG C
- PyG as a promising tool for building graph neural networks (GNNs), and the developers' plans for it
- Rust in Python: high potential for speed-ups, at the cost of considerable engineering overhead and stability problems with the current state-of-the-art tools
Overall, the PyCon DE & PyData Berlin 2023 was an enriching experience.
As both an administrator and a DevOps engineer, I was particularly excited to learn about the latest advancements in the Python language directly from industry professionals. I was of course also very interested in the newest tools and applications that were presented and the development concepts behind them.
For that reason, I was very impressed by the keynote “Towards Learned Database Systems” by Prof. Carsten Binnig, given on the second day of the conference. In this keynote, he presented his working group's ideas for introducing machine-learning algorithms into the heuristic analysis components of relational database systems, where they can, for example, dramatically speed up index building and the calculation of query execution plans. I was also very impressed by the talk “Specifying behavior with Protocols, Typeclasses or Traits. Who wears it better (Python, Scala 3, Rust)?” by Kolja Maier, a very interesting introduction to the typing.Protocol concept of Python and how it relates to traits in Scala and Rust. It really helped my understanding of the concept in general, and I will surely use this feature in future coding projects. Lastly, the talk “Monorepos in Python” was especially interesting for the DevOps engineer in me, as it provided an insightful overview of ways to manage monorepos inside a CI/CD pipeline and offered practical tips on when to use and when to avoid them.
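For readers who have not used typing.Protocol yet, here is a minimal sketch of the idea. It is not taken from the talk: the names `Greeter`, `EnglishGreeter`, `Silent`, and `welcome` are hypothetical and chosen only for illustration. A protocol describes the methods an object must offer, and any class with matching methods satisfies it without inheriting from it, which is roughly what traits express in Scala and Rust.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Greeter(Protocol):
    """Structural interface: any object with a matching greet() qualifies."""

    def greet(self, name: str) -> str: ...


class EnglishGreeter:
    # No inheritance from Greeter; the match is purely structural.
    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


class Silent:
    # Has no greet() method, so it does not satisfy the protocol.
    pass


def welcome(greeter: Greeter, name: str) -> str:
    # A static type checker accepts any argument that provides greet().
    return greeter.greet(name)


message = welcome(EnglishGreeter(), "PyCon")
```

The `@runtime_checkable` decorator is optional. It additionally allows `isinstance(obj, Greeter)` checks, which only test whether the method exists, not whether its signature matches.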
Overall, the PyCon DE & PyData Berlin 2023 conference was an excellent opportunity to learn and grow, and I am grateful to have attended.
Three days. More than 100 talks, keynotes, and workshops and, most of all, a deep dive into a huge community of data scientists, data engineers, and data-anythings. PyCon 2023 was an amazing place to learn about new libraries and new approaches to the challenges anyone working with data faces, and to network. Among many great talks, some left me re-evaluating and reflecting on a conceptual level:
- How can you ensure observability in distributed computing and avoid inefficient digging through many logs? Talk: Observability for Distributed Computing with Dask
- How can you use NLP to improve in-house documentation (even pre-ChatGPT)? Talk: “Who is an NLP expert?” – Lessons Learned from building an in-house QA-system
- Can databases be sped up through neural networks? Talk: Towards Learned Database Systems
Overall, it was an inspiring conference, and I would recommend that anyone use a coffee break to browse the slides and code or to listen to the recorded talks; see the links below.
Unordered collection of links from the conference website and Discord
This list is incomplete, as it only covers the talks we attended. It will grow as we get the chance to watch more recordings. If you want to contribute more links, reach out to email@example.com.
Avanindra Kumar Pandeya: FastAPI and Celery: Building Reliable Web Applications with TDD
Valerio Maggio: Actionable Machine Learning in the Browser with PyScript
Noa Tamir: Keynote – How Are We Managing? Data Teams Management IRL
Guido Imperiale: Data-driven design for the Dask scheduler
samsja: Modern typed python: dive into a mature ecosystem from web dev to machine learning
Yuichiro Tachibana: Streamlit meets WebAssembly – stlite
Christopher Prohm: Pragmatic ways of using Rust in your data project
Noa Tamir, Patrick Hoefler: Let’s contribute to pandas
Thomas Bierhance: Polars – make the switch to lightning-fast dataframes
Lisa Andreevna Chalaguine: How to teach NLP to a newbie & get them started on their first project
newer course from Discord
Susan Shu Chang: Keynote – A journey through 4 industries with Python: Python’s versatile problem-solving toolkit
Cheuk Ting Ho: Driving down the Memray lane – Profiling your data science work
Stephan Sahm: Accelerate Python with Julia
Martin Christen: Geospatial Data Processing in Python: A Comprehensive Tutorial
Marcus Tedesco: Neo4j graph databases for climate policy
Kilian Kluge: Grokking Anchors: Uncovering What a Machine-Learning Model Relies On
Lev Konstantinovskiy: Prompt Engineering 101: Beginner intro to LangChain, the shovel of our ChatGPT gold rush.
Aleksander Molak: The Battle of Giants: Causality vs NLP => From Theory to Practice
Jens Nie: Accelerating Python Code
code / presentation
Jeremy Tuloup: Create interactive Jupyter websites with JupyterLite
Sanket Verma: The Beauty of Zarr