GWDG @ PyCon DE & PyData Berlin 2023

Four colleagues from AG Computing had a fantastic time at PyCon DE & PyData Berlin 2023. Over the three days of the conference, a diverse set of topics related to Python and data science was addressed. Here we want to share our experiences and highlights. We did not have the chance to attend all of the fantastic talks, and we look forward to learning more from the conference recordings. As the recordings become available, we will update this article accordingly.

Hauke

After the opening, my first day ever at a PyCon/PyData conference started with the great tutorial Accelerate Python with Julia by Stephan Sahm. Its practical examples demonstrated how to achieve a significant speed-up for your code without much extra work. One highlight of the second day was the talk Visualizing your computer vision data is not a luxury, it’s a necessity: without it, your models are blind and so do you. by Chazareix Arnault. During his talk, he showed fascinating examples of complicated situations in image data sets (such as weird realities). He also suggested tools for visualizing image data, such as OpenCV, Jupyter Widgets, FiftyOne, and Streamlit. The conference concluded with two excellent talks, Accelerating Python Code by Jens Nie and The Beauty of ZARR by Sanket Verma. Finally, an inspiring talk was given by Jens Agerberg, who took us to the field of topological data analysis with his talk Teaching Neural Networks a Sense of Geometry.

Tino

PyCon DE & PyData Berlin 2023 was a great experience for us as Data Scientists from the GWDG. We had the opportunity to attend a variety of workshops and talks from some of the most influential figures in the world of data science. As a result, we were able to learn about the latest trends and technologies in the field, as well as gain valuable insights from experienced professionals. Additionally, we were able to network with other data scientists and build relationships with potential users.

The event was well-organized, and the atmosphere was buzzing with enthusiasm. We also got hands-on experience with the latest tools and technologies, which was extremely useful. Developers and users of various Python packages presented their experiences with state-of-the-art Big Data and Data Science solutions.

Most exciting takeaways for me:

A detailed overview of management tools for Python packages and environments, which is vital for some current projects at AG C
PyG as a promising tool for building graph neural networks (GNNs), and the developers' plans for it
Rust in Python: high potential for speed-ups, but at the cost of considerable engineering overhead and stability problems with state-of-the-art tools

Overall, the PyCon DE & PyData Berlin 2023 was an enriching experience.

Timon

As both an administrator and DevOps engineer, I was particularly excited about the latest advancements in the Python language presented during the conference, and about the chance to learn more about them directly from industry professionals. I was, of course, also very interested in the newest tools and applications that were presented and the development concepts behind them.

For that reason, I was very impressed by the keynote Towards Learned Database Systems by Prof. Carsten Binnig, given on the second day of the conference. In this keynote, he presented his working group's ideas for introducing machine-learning algorithms into the heuristic analysis components of relational database systems, where they can, for example, be used to dramatically speed up index building and the calculation of query execution plans. Furthermore, I was very impressed by the talk Specifying behavior with Protocols, Typeclasses or Traits. Who wears it better (Python, Scala 3, Rust)? by Kolja Maier, which was a very interesting introduction to the typing.Protocol concept of Python and how it relates to traits in Scala and Rust. It really helped my understanding of the concept in general, and I will surely use this feature in future coding projects. Lastly, the talk Monorepos in Python was especially interesting for the DevOps engineer in me, as it provided an insightful overview of ways to manage monorepos inside a CI/CD pipeline and offered practical tips on when to use and when to avoid them.
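For readers who have not met typing.Protocol yet, here is a minimal sketch of the idea (structural typing: a class satisfies a protocol simply by having the right methods, without inheriting from it). The names Describable, Talk, Tutorial and summarize are made up for illustration and are not taken from the talk.

```python
from typing import Iterable, List, Protocol, runtime_checkable


@runtime_checkable
class Describable(Protocol):
    """Hypothetical protocol: anything with a describe() method returning str."""

    def describe(self) -> str:
        ...


class Talk:
    # Note: Talk does NOT inherit from Describable.
    def __init__(self, title: str) -> None:
        self.title = title

    def describe(self) -> str:
        return f"Talk: {self.title}"


class Tutorial:
    # Also unrelated to Describable by inheritance.
    def __init__(self, title: str, hours: int) -> None:
        self.title = title
        self.hours = hours

    def describe(self) -> str:
        return f"Tutorial ({self.hours} h): {self.title}"


def summarize(items: Iterable[Describable]) -> List[str]:
    """Accept any objects that structurally match Describable."""
    return [item.describe() for item in items]


if __name__ == "__main__":
    for line in summarize([Talk("Monorepos in Python"),
                           Tutorial("Accelerate Python with Julia", 2)]):
        print(line)
```

A static type checker such as mypy accepts both classes wherever a Describable is expected, which is roughly the role a trait bound plays in Rust or Scala. The @runtime_checkable decorator additionally allows isinstance() checks, but these only verify that the method exists, not that its signature matches.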

Overall, the PyCon DE & PyData Berlin 2023 conference was an excellent opportunity to learn and grow, and I am grateful to have attended.

Dorothea

3 days. More than 100 talks, keynotes, and workshops. And most of all: a deep dive into a huge community of data scientists, data engineers and data-anythings. PyCon 2023 was an amazing place to learn about new libraries and new approaches to the challenges anyone faces when working with data, and to network. Among many great talks, some left me re-evaluating and reflecting on a conceptual level:

How can you ensure observability in distributed computing and avoid inefficient digging through many logs? Observability for Distributed Computing with Dask
How can you use NLP to improve your in-house documentation (even pre-ChatGPT)? “Who is an NLP expert?” – Lessons Learned from building an in-house QA-system
Can databases be sped up through neural networks? Towards Learned Database Systems

Overall, it was an inspiring conference, and I would recommend that everyone use a coffee break to browse through the slides and code or listen to the recorded talks; see the links below.

Unordered collection of links from the conference website and Discord

Not every talk had enough seats. Luckily, cozy chairs with a great view were available. Source: Hauke Kirchner

This list is incomplete, as it only covers the talks we attended in person. It will grow as we get the chance to watch more recordings. If you want to contribute more links, reach out to hauke.kirchner@gwdg.de.

Avanindra Kumar Pandeya: FastAPI and Celery: Building Reliable Web Applications with TDD

Description
slides and code

Valerio Maggio: Actionable Machine Learning in the Browser with PyScript

description
slides

Noa Tamir: Keynote – How Are We Managing? Data Teams Management IRL

Description
slides

Guido Imperiale: Data-driven design for the Dask scheduler

description
slides

samsja: Modern typed python: dive into a mature ecosystem from web dev to machine learning

Description
slides

Yuichiro Tachibana: Streamlit meets WebAssembly – stlite

description
slides