Analysis of streaming data in real time has long been the domain of big data frameworks, predominantly written in Java. Taking advantage of those capabilities from Python requires using client libraries that suffer from impedance mismatches that make the work harder than necessary. Bytewax is a new open source platform for writing stream processing applications in pure Python that don’t have to be translated into foreign idioms. In this episode Bytewax founder Zander Matheson explains how the system works and how to get started with it today.
Static typing versus dynamic typing is one of the oldest debates in software development. In recent years a number of dynamic languages have worked toward a middle ground by adding support for type hints. Python's type annotations have given rise to an ecosystem of tools that use that type information to validate the correctness of programs and help identify potential bugs. At Instagram they created the Pyre project with a focus on speed to allow for scaling to huge Python projects. In this episode Shannon Zhu discusses how it is implemented, how to use it in your development process, and how it compares to other type checkers in the Python ecosystem.
19 September 2022 •
Every software project is subject to a series of decisions and tradeoffs. One of the first decisions to make is which programming language to use. For companies where their product is software, this is a decision that can have significant impact on their overall success. In this episode Sean Knapp discusses the languages that his team at Ascend use for building a service that powers complex and business critical data workflows. He also explains his motivation to standardize on Python for all layers of their system to improve developer productivity.
13 September 2022 •
Writing code is only one piece of creating good software. Code reviews are an important step in the process of building applications that are maintainable and sustainable. In this episode On Freund shares his thoughts on the myriad purposes that code reviews serve, as well as exploring some of the patterns and anti-patterns that grow up around a seemingly simple process.
5 September 2022 •
Quality assurance in the software industry has become a shared responsibility in most organizations. Given the rapid pace of development and delivery it can be challenging to ensure that your application is still working the way it's supposed to with each release. In this episode Jonathon Wright discusses the role of quality assurance in modern software teams and how automation can help.
28 August 2022 •
The goal of every software team is to get their code into production without breaking anything. This requires establishing a repeatable process that doesn't introduce unnecessary roadblocks and friction. In this episode Ronak Rahman discusses the challenges that development teams encounter when trying to build and maintain velocity in their work, the role that access to infrastructure plays in that process, and how to build automation and guardrails for everyone to take part in the delivery process.
14 August 2022 •
Every startup begins with an idea, but that won't get you very far without testing the feasibility of that idea. A common practice is to build a Minimum Viable Product (MVP) that addresses the problem that you are trying to solve and work with early customers as they engage with that MVP. In this episode Tony Pavlovych shares his thoughts on Python's strengths when building and launching that MVP and some of the potential pitfalls that businesses can run into on that path.
31 July 2022 •
Application architectures have been in a constant state of evolution as new infrastructure capabilities are introduced. Virtualization, cloud, containers, mobile, and now WebAssembly have each introduced new options for how to build and deploy software. Recognizing the transformative potential of WebAssembly, Matt Butcher and his team at Fermyon are investing in tooling and services to improve the developer experience. In this episode he explains the opportunity that WebAssembly offers to all language communities, what they are building to power lightweight server-side microservices, and how Python developers can get started building and contributing to this nascent ecosystem.
25 July 2022 •
As your code scales beyond a trivial level of complexity and sophistication it becomes difficult or impossible to know everything that it is doing. The flow of logic and data through your software and which parts are taking the most time are impossible to understand without help from your tools. VizTracer is the tool that you will turn to when you need to know all of the execution paths that are being exercised and which of those paths are the most expensive. In this episode Tian Gao explains why he created VizTracer and how you can use it to gain a deeper familiarity with the code that you are responsible for maintaining.
17 July 2022 •
3 July 2022 •
Virtually everything that you interact with on a daily basis and many other things that make modern life possible were designed and modeled in software called CAD or Computer-Aided Design. These programs are advanced suites with graphical editing environments tailored to domain experts in areas such as mechanical engineering, electrical engineering, architecture, etc. While the UI-driven workflow is more accessible, it isn't scalable which opens the door to code-driven workflows. In this episode Jeremy Wright discusses the design, uses, and benefits of the CadQuery framework for building 3D CAD models entirely in Python.
27 June 2022 •
Building any software project is going to require relying on dependencies that you and your team didn't write or maintain, and many of those will have dependencies of their own. This has led to a wide variety of potential and actual issues ranging from developer ergonomics to application security. In order to provide a higher degree of confidence in the optimal combinations of direct and transitive dependencies a team at Red Hat started Project Thoth. In this episode Fridolín Pokorný explains how the Thoth resolver uses multiple signals to find the best combination of dependency versions to ensure compatibility and avoid known security issues.
15 June 2022 •
Most developers have encountered code completion systems and rely on them as part of their daily work. They allow you to stay in the flow of programming, but have you ever stopped to think about how they work? In this episode Meredydd Luff takes us behind the scenes to dig into the mechanics of code completion engines and how you can customize them to fit your particular use case.
30 May 2022 •
Russell Keith-Magee is an accomplished engineer and a fixture of the Python community. His work on the BeeWare suite of projects is one of the most ambitious undertakings in the ecosystem and unfailingly forward-looking. With his recent transition to working for Anaconda he is now able to dedicate his full focus to the effort. In this episode he reflects on the journey that he has taken so far, how BeeWare is helping to address some of the threats to Python's long term viability, and how he envisions its future in light of the recent release of PyScript, an in-browser runtime for Python.
24 May 2022 •
Digital cameras and the widespread availability of smartphones have allowed us all to generate massive libraries of personal photographs. Unfortunately, now we are all left to our own devices for how to manage them. While cloud services such as iPhotos and Google Photos are convenient, they aren't always affordable and they put your pictures under the control of large companies with their own agendas. LibrePhotos is an open source and self-hosted alternative to these services that puts you in control of your digital memories. In this episode the maintainer of LibrePhotos, Niaz Faridani-Rad, explains how he got involved with the project, the capabilities that it offers for managing your image library, and how to get your own instance set up to take back control of your pictures.
16 May 2022 •
Investing effectively is largely a game of information access and analysis. This can involve a substantial amount of research and time spent on finding, validating, and acquiring different information sources. In order to reduce the barrier to entry and provide a powerful framework for amateur and professional investors alike Didier Rodrigues Lopes created the OpenBB Terminal. In this episode he explains how a pandemic project that started as an experiment has led to him founding a new company and dedicating his time to growing and improving the project and its community.
10 May 2022 •
The experimentation phase of building a machine learning model requires a lot of trial and error. One of the limiting factors of how many experiments you can try is the length of time required to train the model which can be on the order of days or weeks. To reduce the time required to test different iterations Rolando Garcia Sanchez created FLOR which is a library that automatically checkpoints training epochs and instruments your code so that you can bypass early training cycles when you want to explore a different path in your algorithm. In this episode he explains how the tool works to speed up your experimentation phase and how to get started with it.
2 May 2022 •
Programmers love to automate tedious processes, including refactoring your code. In order to support the creation of code modifications for your Python projects Jimmy Lai created LibCST. It provides a richly typed and high level API for creating and manipulating concrete syntax trees of your source code. In this episode Jimmy Lai and Zsolt Dollenstein explain how it works, some of the linting and automatic code modification utilities that you can build with it and how to get started with using it to maintain your own Python projects.
25 April 2022 •
Communication is a fundamental requirement for any program or application. As the friction involved in deploying code has gone down, the motivation for architecting your system as microservices has gone up. This shifts the communication patterns in your software from function calls to network calls. In this episode Idit Levine explains how the Gloo platform that she and her team at Solo have created makes it easier for you to configure and monitor the network topologies for your microservice environments. She also discusses what developers need to know about networking in cloud native environments and how a combination of API gateways and service mesh technologies allows you to more rapidly iterate on your systems.
19 April 2022 •
Cloud native architectures have been gaining prominence for the past few years due to the rising popularity of Kubernetes. This introduces new complications to development workflows due to the need to integrate with multiple services as you build new components for your production systems. In order to reduce the friction involved in developing applications for cloud native environments Michael Schilonka created Gefyra. In this episode he explains how it connects your local machine to a running Kubernetes environment so that you can rapidly iterate on your software in the context of the whole system. He also shares how the Django Hurricane plugin lets your applications work closely with the Kubernetes process model.
11 April 2022 •
Science is founded on the collection and analysis of data. For disciplines that rely on data about the earth, the ability to simulate and generate that data has been growing faster than the tools for analyzing it can keep up. In order to help scale that capacity for everyone working in geosciences the Pangeo project compiled a reference stack that combines powerful tools into an out-of-the-box solution for researchers to be productive in short order. In this episode Ryan Abernathey and Joe Hamman explain what the Pangeo project really is, how they have integrated a combination of XArray, Dask, and Jupyter to power these analytical workflows, and how it has helped to accelerate research on multidimensional geospatial datasets.
28 March 2022 •
A common piece of advice when starting anything new is to "begin with the end in mind". In order to help the engineers at Wayfair manage the complete lifecycle of their applications Joshua Woodward runs a team that provides tooling and assistance along every step of the journey. In this episode he shares some of the lessons and tactics that they have developed while assisting other engineering teams with starting, deploying, and sunsetting projects. This is an interesting look at the inner workings of large organizations and how they invest in the scaffolding that supports their myriad efforts.
20 March 2022 •
Kubernetes is a framework that aims to simplify the work of running applications in production, but it forces you to adopt new patterns for debugging and resolving issues in your systems. Robusta is aimed at making that a more pleasant experience for developers and operators through pre-built automations, easy debugging, and a simple means of creating your own event-based workflows to find, fix, and alert on errors in production. In this episode Natan Yellin explains how the project got started, how it is architected and tested, and how you can start using it today to keep your Python projects running reliably.
14 March 2022 •
Building a machine learning application is inherently complex. Once it becomes necessary to scale the operation or training of the model, or introduce online re-training, the process becomes even more challenging. In order to reduce the operational burden on AI developers Robert Nishihara helped to create the Ray framework that handles the distributed computing aspects of machine learning operations. To support the ongoing development and simplify adoption of Ray he co-founded Anyscale. In this episode he re-joins the show to share how the project, its community, and the ecosystem around it have grown and evolved over the intervening two years. He also explains how the techniques and adoption of machine learning have influenced the direction of the project.
6 March 2022 •
As software projects grow and change it can become difficult to keep track of all of the logical flows. By visualizing the interconnections of function definitions, classes, and their invocations you can speed up the time to comprehension for newcomers to a project, or help yourself remember what you worked on last month. In this episode Scott Rogowski shares his work on Code2Flow as a way to generate a call graph of your programs. He explains how it got started, how it works, and how you can start using it to understand your Python, Ruby, and PHP projects.
28 February 2022 •
One of the most persistent challenges faced by organizations of all sizes is the recording and distribution of institutional knowledge. In technical teams this is exacerbated by the need to incorporate technical review feedback and manage access to data before publishing. When faced with this problem as an early data scientist at Airbnb, Chetan Sharma helped create the Knowledge Repo project as a solution. In this episode he shares the story behind its creation and growth, how and why it was released as open source, and the features that make it a compelling option for your own team's knowledge management journey.
21 February 2022 •
Software development is a complex undertaking due to the number of options available and choices to be made in every stage of the lifecycle. In order to make it more scalable it is necessary to establish common practices and patterns and introduce strong opinions. One area that can have a huge impact on the productivity of the engineers engaged with a project is the tooling used for building, validating, and deploying changes introduced to the software. In this episode maintainers of the Pants build tool Eric Arellano, Stu Hood, and Andreas Stenius discuss the recent updates that add support for more languages, efforts made to simplify its adoption, and the growth of the community that uses it. They also explore how using Pants as the single entry point for all of your routine tasks allows you to spend your time on the decisions that matter.
14 February 2022 •
It doesn't matter how amazing your application is if you are unable to deliver it to your users. Frustrated with the rampant complexity involved in building and deploying software Vlad A. Ionescu created the Earthly tool to reduce the toil involved in creating repeatable software builds. In this episode he explains the complexities that are inherent to building software projects and how he designed the syntax and structure of Earthly to make it easy to adopt for developers across all language environments. By adopting Earthly you can use the same techniques for building on your laptop and in your CI/CD pipelines.
6 February 2022 •
The process of getting software delivered to an environment where users can interact with it requires many steps along the way. In some cases the journey can require a large number of interdependent workflows that need to be orchestrated across technical and organizational boundaries, making it difficult to know what the current status is. Faced with such a complex delivery workflow the engineers at Ericsson created a message based protocol and accompanying tooling to let the various actors in the process provide information about the events that happened across the different stages. In this episode Daniel Ståhl and Magnus Bäck explain how the Eiffel protocol allows you to build a tooling agnostic visibility layer for your software delivery process, letting you answer all of your questions about what is happening between writing a line of code and your users executing it.
31 January 2022 •
When we are creating applications we spend a significant amount of effort on optimizing the experience of our end users to ensure that they are able to complete the tasks that the system is intended for. A similar effort that we should all consider is optimizing the developer experience for ourselves and other engineers who contribute to the projects that we work on. Adam Johnson recently wrote a book on how to improve the developer experience for Django projects and in this episode he shares some of the insights that he has gained through that project and his work with clients to help you improve the experience that you and your team have when collaborating on software development.
24 January 2022 •
Pandas has grown to be a ubiquitous tool for working with data at every stage. It has become so well known that many people learn Python solely for the purpose of using Pandas. With all of this activity and the long history of the project it can be easy to find misleading or outdated information about how to use it. In this episode Matt Harrison shares his work on the book "Effective Pandas" and some of the best practices and potential pitfalls that you should know for applying Pandas in your own work.
15 January 2022 •
Developers hate wasting effort on manual processes when we can write code to do it instead. Cog is a tool to manage the work of automating the creation of text inside another file by executing arbitrary Python code. In this episode Ned Batchelder shares the story of why he created Cog in the first place, some of the interesting ways that he uses it in his daily work, and the unique challenges of maintaining a project with a small audience and a well defined scope.
13 January 2022 •
Statistical regression models are a staple of predictive forecasts in a wide range of applications. In this episode Matthew Rudd explains the various types of regression models, when to use them, and his work on the book "Regression: A Friendly Guide" to help programmers add regression techniques to their toolbox.
2 January 2022 •
Every software project needs a tool for managing the repetitive tasks that are involved in building, running, and deploying the code. Frustrated with the limitations of tools like Make, SCons, and others, Eduardo Schettino created doit to handle task automation in his own work and released it as open source. In this episode he shares the story behind the project, how it is implemented under the hood, and how you can start using it in your own projects to save you time and effort.
27 December 2021 •
Whether we like it or not, advertising is a common and effective way to make money on the internet. In order to support the work being done at Read The Docs they decided to include advertisements on the documentation sites they were hosting, but they didn't want to alienate their users or collect unnecessary information. In this episode David Fischer explains how they built the Ethical Ads network to solve their problem, the technical and business challenges that are involved, and the open source application that they built to power their network.
20 December 2021 •
Podcasts are one of the few mediums in the internet era that are still distributed through an open ecosystem. This has a number of benefits, but it also brings the challenge of making it difficult to find the content that you are looking for. Frustrated by the inability to pick and choose single episodes across various shows for his listening Wenbin Fang started the Listen Notes project to fulfill his own needs. He ended up turning that project into his full time business which has grown into the most full featured podcast search engine on the market. In this episode he explains how he built the Listen Notes application using Python and Django, his work to turn it into a sustainable business, and the various ways that you can build other applications and experiences on top of his API.
12 December 2021 •
Outer space holds a deep fascination for people of all ages, and the key principle in its exploration both near and far is orbital mechanics. Poliastro is a pure Python package for exploring and simulating orbit calculations. In this episode Juan Luis Cano Rodriguez shares the story behind the project, how you can use it to learn more about space travel, and some of the interesting projects that have used it for planning planetary and interplanetary missions.
27 November 2021 •
Deep learning frameworks encourage you to focus on the structure of your model ahead of the data that you are working with. Ludwig is a tool that uses a data oriented approach to building and training deep learning models so that you can experiment faster based on the information that you actually have, rather than spending all of your time manipulating features to make them match your inputs. In this episode Travis Addair explains how Ludwig is designed to improve the adoption of deep learning for more companies and a wider range of users. He also explains how the Horovod framework plugs in easily to allow for scaling your training workflow from your laptop out to a massive cluster of servers and GPUs. The combination of these tools allows for a declarative workflow that starts off easy but gives you full control over the end result.
22 November 2021 •
A lot of time and energy goes into data analysis and machine learning projects to address various goals. Most of the effort is focused on the technical aspects and validating the results, but how much time do you spend on considering the experience of the people who are using the outputs of these projects? In this episode Benn Stancil explores the impact that our technical focus has on the perceived value of our work, and how taking the time to consider what the desired experience will be can lead us to approach our work more holistically and increase the satisfaction of everyone involved.
22 November 2021 •
The true power of artificial intelligence is its ability to work collaboratively with humans. Nate Joens co-founded Structurely to create a conversational AI platform that augments human sales teams to help guide potential customers through the initial steps of the funnel. In this episode he discusses the technical and social considerations that need to be combined for a seamless conversational experience and how he and his team are tackling the problem.
6 November 2021 •
Every machine learning model has to start with feature engineering. This is the process of combining input variables into a more meaningful signal for the problem that you are trying to solve. Many times this process can lead to duplicating code from previous projects, or introducing technical debt in the form of poorly maintained feature pipelines. In order to make the practice more manageable Soledad Galli created the feature-engine library. In this episode she explains how it has helped her and others build reusable transformations that can be applied in a composable manner with your scikit-learn projects. She also discusses the importance of understanding the data that you are working with and the domain in which your model will be used to ensure that you are selecting the right features.
31 October 2021 •
The speed of Python is a subject of constant debate, but there is no denying that for compute heavy work it is not the optimal tool. Rather than rewriting your data oriented applications, or having to rearchitect them, the team at Bodo wrote a compiler that will do the optimization for you. In this episode Ehsan Totoni explains how they are able to translate pure Python into massively parallel processes that are optimized for high performance compute systems.
25 October 2021 •
The world of finance has driven the development of many sophisticated techniques for data analysis. In this episode Paul Stafford shares his experiences working in the realm of risk management for financial exchanges. He discusses the types of risk that are involved, the statistical methods that he has found most useful for identifying strategies to mitigate that risk, and the software libraries that have helped him most in his work.
16 October 2021 •
Machine learning and deep learning techniques are powerful tools for a large and growing number of applications. Unfortunately, it is difficult or impossible to understand the reasons for the answers that they give to the questions they are asked. In order to help shine some light on what information is being used to produce the outputs of your machine learning models Scott Lundberg created the SHAP project. In this episode he explains how it can be used to provide insight into which features are most impactful when generating an output, and how that insight can be applied to make more useful and informed design choices. This is a fascinating and important subject and this episode is an excellent exploration of how to start addressing the challenge of explainability.
9 October 2021 •
Finding new and effective treatments for disease is a complex and time consuming endeavor, requiring a high degree of domain knowledge and specialized equipment. Combining his expertise in machine learning and graph algorithms with his interest in drug discovery Jian Tang created the TorchDrug project to help reduce the amount of time needed to find new candidate molecules for testing. In this episode he explains how the project is being used by machine learning researchers and biochemists to collaborate on finding effective treatments for real-world diseases.
30 September 2021 •
The overwhelming growth of smartphones, smart speakers, and spoken word content has corresponded with increasingly sophisticated machine learning models for recognizing speech content in audio data. Dylan Fox founded Assembly to provide access to the most advanced automated speech recognition models for developers to incorporate into their own products. In this episode he gives an overview of the current state of the art for automated speech recognition, the varying requirements for accuracy and speed of models depending on the context in which they are used, and what is required to build a special purpose model for your own ASR applications.
26 September 2021 •
Reinforcement learning is a branch of machine learning and AI that has a lot of promise for applications that need to evolve with changes to their inputs. To support the research happening in the field, including applications for robotics, Carlo D'Eramo and Davide Tateo created MushroomRL. In this episode they share how they have designed the project to be easy to work with, so that students can use it in their study, as well as extensible so that it can be used by businesses and industry professionals. They also discuss the strengths of reinforcement learning, how to design problems that can leverage its capabilities, and how to get started with MushroomRL for your own work.
19 September 2021 •
A perennial problem of doing data science is that it works great on your laptop, until it doesn't. Another problem is being able to recreate your environment to collaborate on a problem with colleagues. Saturn Cloud aims to help with both of those problems by providing an easy to use platform for creating reproducible environments that you can use to build data science workflows and scale them easily with a managed Dask service. In this episode Julia Signell, head of open source at Saturn Cloud, explains how she is working with the product team and PyData community to reduce the points of friction that data scientists encounter as they are getting their work done.
10 September 2021 •
You've got a machine learning model trained and running in production, but that's only half of the battle. Are you certain that it is still serving the predictions that you tested? Are the inputs within the range of tolerance that you designed? Monitoring machine learning products is an essential step of the story so that you know when it needs to be retrained against new data, or parameters need to be adjusted. In this episode Emeli Dral shares the work that she and her team at Evidently are doing to build an open source system for tracking and alerting on the health of your ML products in production. She discusses the ways that model drift can occur, the types of metrics that you need to track, and what to do when the health of your system is suffering. This is an important and complex aspect of the machine learning lifecycle, so give it a listen and then try out Evidently for your own projects.
3 September 2021 •
Building a machine learning model is a process that requires a lot of iteration and trial and error. For certain classes of problem a large portion of the searching and tuning can be automated. This allows data scientists to focus their time on more complex or valuable projects, as well as opening the door for non-specialists to experiment with machine learning. Frustrated with some of the awkward or difficult to use tools for AutoML, Angela Lin and Jeremy Shih helped to create the EvalML framework. In this episode they share the use cases for automated machine learning, how they have designed the EvalML project to be approachable, and how you can use it for building and training your own models.
25 August 2021 •
Data scientists are tasked with answering challenging questions using data that is often messy and incomplete. Anaconda is on a mission to make the lives of data professionals more manageable through the creation and maintenance of high quality libraries and frameworks, the delivery of an easy to use Python distribution and package ecosystem, and high quality training material. In this episode Kevin Goldsmith, CTO of Anaconda, discusses the technical and social challenges faced by data scientists, the ways that the Python ecosystem has evolved to help address those difficulties, and how Anaconda is engaging with the community to provide high quality tools and education for this constantly changing practice.
19 August 2021 •
Analysing networks is a growing area of research in academia and industry. In order to be able to answer questions about large or complex relationships it is necessary to have fast and efficient algorithms that can process the data quickly. In this episode Eugenio Angriman discusses his contributions to the NetworKit library to provide an accessible interface for these algorithms. He shares how he is using NetworKit for his own research, the challenges of working with large and complex networks, and the kinds of questions that can be answered with data that fits on your laptop.
15 August 2021 •
Building a software-as-a-service (SaaS) business is a fairly well understood pattern at this point. When the core of the service is a set of machine learning products it introduces a whole new set of challenges. In this episode Dylan Fox shares his experience building Assembly AI as a reliable and affordable option for automatic speech recognition that caters to a developer audience. He discusses the machine learning development and deployment processes that his team relies on, the scalability and performance considerations that deep learning models introduce, and the user experience design that goes into building for a developer audience. This is a fascinating conversation about a unique cross-section of considerations and how Dylan and his team are building an impressive and useful service.
4 August 2021 •
SQL has gone through many cycles of popularity and disfavor. Despite its longevity it is objectively challenging to work with in a collaborative and composable manner. In order to address these shortcomings and build a new interface for your database oriented workloads Erez Shinan created Preql. It is based on the same relational algebra that inspired SQL, but brings in more robust computer science principles to make it more manageable as you scale in complexity. In this episode he shares his motivation for creating the Preql project, how he has used Python to develop a new language for interacting with database engines, and the challenges of taking on the legacy of SQL as an individual.
28 July 2021 •
When you start working on a data project there are always a variety of unknown factors that you have to explore. One of those is the volume of total data that you will eventually need to handle, and the speed and scale at which it will need to be processed. If you optimize for scale too early then it adds a high barrier to entry due to the complexities of distributed systems, but if you invest in a lot of engineering up front then it can be challenging to refactor for scale. Modin is a project that aims to remove that decision by letting you seamlessly replace your existing Pandas code and scale across CPU cores or across a cluster of machines. In this episode Devin Petersohn explains why he started working on solving this problem, how Modin is architected to allow for a smooth escalation from small to large volumes of data and compute, and how you can start using it today to accelerate your Pandas workflows.
22 July 2021 •
With the rising availability of computation in everyday devices, there has been a corresponding increase in the appetite for voice as the primary interface. To accommodate this desire it is necessary for us to have high quality libraries that can process and generate audio data and make sense of human speech. To facilitate research and industry applications for speech data Mirco Ravanelli and Peter Plantinga are building SpeechBrain. In this episode they explain how it works under the hood, the projects that they are using it for, and how you can get started with it today.
14 July 2021 •
If you are interested in a library for working with graph structures that will also help you learn more about the research and theory behind the algorithms then look no further than graph-tool. In this episode Tiago Peixoto shares his work on graph algorithms and networked data and how he has built graph-tool to help in that research. He explains how it is implemented, how it evolved from a simple command line tool to a full-fledged library, and the benefits that he has found from building a personal project in the open.
7 July 2021 •
Deep learning has largely taken over the research and applications of artificial intelligence, with some truly impressive results. The challenge that it presents is that for reasonable speed and performance it requires specialized hardware, generally in the form of a dedicated GPU (Graphics Processing Unit). This raises the cost of the infrastructure, adds deployment complexity, and drastically increases the energy requirements for training and serving of models. To address these challenges Nir Shavit combined his experiences in multi-core computing and brain science to co-found Neural Magic where he is leading the efforts to build a set of tools that prune dense neural networks to allow them to execute on commodity CPU hardware. In this episode he explains how sparsification of deep learning models works, the potential that it unlocks for making machine learning and specialized AI more accessible, and how you can start using it today.
30 June 2021 •
Brett Cannon has been a long-time contributor to the Python language and community in many ways. In this episode he shares some of his work and thoughts on modernizing the ecosystem around the language. This includes standards for packaging, discovering the true core of the language, and how to make it possible to target mobile and web platforms.
23 June 2021 •
The foundation of every ML model is the data that it is trained on. In many cases you will be working with tabular or unstructured information, but there is a growing trend toward networked, or graph data sets. Benedek Rozemberczki has focused his research and career around graph machine learning applications. In this episode he discusses the common sources of networked data, the challenges of working with graph data in machine learning projects, and describes the libraries that he has created to help him in his work. If you are dealing with connected data then this interview will provide a wealth of context and resources to improve your projects.
16 June 2021 •
The growth of analytics has accelerated the use of SQL as a first class language. It has also grown the amount of collaboration involved in writing and maintaining SQL queries. With collaboration comes the inevitable variation in how queries are written, both structurally and stylistically, which can lead to a significant amount of wasted time and energy during code review and employee onboarding. Alan Cruickshank was feeling the pain of this wasted effort first-hand, which led him down the path of creating SQLFluff as a linter and formatter to enforce consistency and find bugs in the SQL code that he and his team were working with. In this episode he shares the story of how SQLFluff evolved from a simple hackathon project to an open source linter that is used across a range of companies and fosters a growing community of users and contributors. He explains how it has grown to support multiple dialects of SQL, as well as integrating with projects like DBT to handle templated queries. This is a great conversation about the long detours that are sometimes necessary to reach your original destination and the powerful impact that good tooling can have on team productivity.
9 June 2021 •
Deep learning is gaining an immense amount of popularity due to the incredible results that it is able to offer with comparatively little effort. Because of this there are a number of engineers who are trying their hand at building machine learning models with the wealth of frameworks that are available. Andrew Ferlitsch wrote a book to capture the useful patterns and best practices for building models with deep learning to make it more approachable for newcomers to the field. In this episode he shares his deep expertise and extensive experience in building and teaching machine learning across many companies and industries. This is an entertaining and educational conversation about how to build maintainable models across a variety of applications.
2 June 2021 •
Unit tests are an important tool to ensure the proper functioning of your application, but writing them can be a chore. Stephan Lukasczyk wants to reduce the monotony of the process for Python developers. As part of his PhD research he created the Pynguin project to automate the creation of unit tests. In this episode he explains the complexity involved in generating useful tests for a dynamic language, how he has designed Pynguin to address the challenges, and how you can start using it today for your own work.
25 May 2021 •
Natural language processing is a powerful tool for extracting insights from large volumes of text. With the growth of the internet and social platforms, and the increasing number of people and communities conducting their professional and personal activities online, the opportunities for NLP to create amazing insights and experiences are endless. In order to work with such a large and growing corpus it has become necessary to move beyond purely statistical methods and embrace the capabilities of deep learning, and transfer learning in particular. In this episode Paul Azunre shares his journey into the application and implementation of transfer learning for natural language processing. This is a fascinating look at the possibilities of emerging machine learning techniques for transforming the ways that we interact with technology.
18 May 2021 •
Machine learning is a tool that has typically been performed on large volumes of data in one place. As more computing happens at the edge on mobile and low power devices, the learning is being federated which brings a new set of challenges. Daniel Beutel co-created the Flower framework to make federated learning more manageable. In this episode he shares his motivations for starting the project, how you can use it for your own work, and the unique challenges and benefits that this emerging model offers. This is a great exploration of the federated learning space and a framework that makes it more approachable.
11 May 2021 •
Data exploration is an important step in any analysis or machine learning project. Visualizing the data that you are working with makes that exploration faster and more effective, but having to remember and write all of the code to build a scatter plot or histogram is tedious and time consuming. In order to eliminate that friction Doris Lee helped create the Lux project, which wraps your Pandas data frame and automatically generates a set of visualizations without you having to lift a finger. In this episode she explains how Lux works under the hood, what inspired her to create it in the first place, and how it can help you create a better end result. The Lux project is a valuable addition to the toolbox of anyone who is doing data wrangling with Pandas.
4 May 2021 •
Any project that is used by more than one person will eventually need to handle permissions for each of those users. It is certainly possible to write that logic yourself, but you'll almost certainly do it wrong at least once. Rather than waste your time fighting with bugs in your authorization code it makes sense to use a well-maintained library that has already made and fixed all of the mistakes so that you don't have to. In this episode Sam Scott shares the Oso framework to give you a clean separation between your authorization policies and your application code. He explains how you can call a simple function to ask if something is allowed, and then manage the complex rules that match your particular needs as a separate concern. He describes the motivation for building a domain specific language based on logic programming for policy definitions, how it integrates with the host language (such as Python), and how you can start using it in your own applications today. This is a must listen even if you never use the project because it is a great exploration of all of the incidental complexity that is involved in permissions management.
27 April 2021 •
Being able to present your ideas is one of the most valuable and powerful skills to have as a professional, regardless of your industry. For software engineers it is especially important to be able to communicate clearly and effectively because of the detail-oriented nature of the work. Unfortunately, many people who work in software are more comfortable in front of the keyboard than a crowd. In this episode Neil Thompson shares his story of being an accidental public speaker and how he is helping other engineers start down the road of being effective presenters. He discusses the benefits for your career, how to build the skills, and how to find opportunities to practice them. Even if you never want to speak at a conference, it's still worth your while to listen to Neil's advice and find ways to level up your presentation and speaking skills.
20 April 2021 •
One of the great promises of computers is that they will make our work faster and easier, so why do we all spend so much time manually copying data from websites, or entering information into web forms, or any of the other tedious tasks that take up our time? As developers our first inclination is to "just write a script" to automate things, but how do you share that with your non-technical co-workers? In this episode Antti Karjalainen, CEO and co-founder of Robocorp, explains how Robotic Process Automation (RPA) can help us all cut down on time-wasting tasks and let the computers do what they're supposed to. He shares how he got involved in the RPA industry, his work with Robot Framework and RPA framework, how to build and distribute bots, and how to decide if a task is worth automating. If you're sick of spending your time on mind-numbing copy and paste then give this episode a listen and then let the robots do the work for you.
13 April 2021 •
When you are writing code it is all too easy to introduce subtle bugs or leave behind unused code: unused variables, unused imports, overly complex logic, etc. If you are careful and diligent you can find these problems yourself, but isn't that what computers are supposed to help you with? Thankfully Python has a wealth of tools that will work with you to keep your code clean and maintainable. In this episode Anthony Sottile explores Flake8, one of the most popular options for identifying those problematic lines of code. He shares how he became involved in the project and took over as maintainer and explains the different categories of code quality tooling and how Flake8 compares to other static analyzers. He also discusses the ecosystem of plugins that have grown up around it, including some detailed examples of how you can write your own (and why you might want to).
6 April 2021 •
Writing code that is easy to read and understand will have a lasting impact on you and your teammates over the life of a project. Sometimes it can be difficult to identify opportunities for simplifying a block of code, especially if you are early in your journey as a developer. If you work with senior engineers they can help by pointing out ways to refactor your code to be more readable, but they aren't always available. Brendan Maginnis and Nick Thapen created Sourcery to act as a full time pair programmer sitting in your editor of choice, offering suggestions and automatically refactoring your Python code. In this episode they share their journey of building a tool to automatically find opportunities for refactoring in your code, including how it works under the hood, the types of refactoring that it supports currently, and how you can start using it in your own work today. It always pays to keep your tool box organized and your tools sharp and Sourcery is definitely worth adding to your repertoire.
30 March 2021 •
Becoming data driven is the stated goal of a large and growing number of organizations. In order to achieve that mission they need a reliable and scalable method of accessing and analyzing the data that they have. While business intelligence solutions have been around for ages, they don't all work well with the systems that we rely on today and a majority of them are not open source. Superset is a Python powered platform for exploring your data and building rich interactive dashboards that gets the information that your organization needs in front of the people that need it. In this episode Maxime Beauchemin, the creator of Superset, shares how the project got started and why it has become such a widely used and popular option for exploring and sharing data at companies of all sizes. He also explains how it functions, how you can customize it to fit your specific needs, and how to get it up and running in your own environment.
22 March 2021 •
Python is a language that is used in almost every imaginable context and by people from an amazing range of backgrounds. A lot of the people who use it wouldn't even call themselves programmers, because that is not the primary focus of their job. In this episode Chris Moffitt shares his experience writing Python as a business user. In order to share his insights and help others who have run up against the limits of Excel he maintains the site Practical Business Python where he publishes articles that help introduce newcomers to Python and explain how to perform tasks such as building reports, automating Excel files, and doing data analysis. This is a great conversation that illustrates how useful it is to learn Python even if you never intend to write software professionally.
16 March 2021 •
There are a large and growing number of businesses built by and for data science and machine learning teams that rely on Python. Tony Liu is a venture investor who is following that market closely and betting on its continued success. In this episode he shares his own journey into the role of an investor and discusses what he is most excited about in the industry. He also explains what he looks at when investing in a business and gives advice on what potential founders and early employees of startups should be thinking about when starting on that journey.
9 March 2021 •
Jupyter notebooks are a dominant tool for data scientists, but they lack a number of conveniences for building reusable and maintainable systems. For machine learning projects in particular there is a need for being able to pivot from exploring a particular dataset or problem to integrating that solution into a larger workflow. Rick Lamers and Yannick Perrenet were tired of struggling with one-off solutions when they created the Orchest platform. In this episode they explain how Orchest allows you to turn your notebooks into executable components that are integrated into a graph of execution for running end-to-end machine learning workflows.
2 March 2021 •
When you are writing a script it can become unwieldy to understand how the logic and data are flowing through the program. To make this easier to follow you can use a flow-based approach to building your programs. Leonn Thomm created the Ryven project as an environment for visually constructing a flow-based program. In this episode he shares his inspiration for creating the Ryven project, how it changes the way you think about program design, how Ryven is implemented, and how to get started with it for your own programs.
23 February 2021 •
One of the perennial challenges in software engineering is to reduce the opportunity for bugs to creep into the system. Some of the tools in our arsenal that help in this endeavor include rich type systems, static analysis, writing tests, well defined interfaces, and linting. Phillip Schanely created the CrossHair project in order to add another ally in the fight against broken code. It sits somewhere between type systems, automated test generation, and static analysis. In this episode he explains his motivation for creating it, how he uses it for his own projects, and how to start incorporating it into yours. He also discusses the utility of writing contracts for your functions, and the differences between property based testing and SMT solvers. This is an interesting and informative conversation about some of the more nuanced aspects of how to write well-behaved programs.
16 February 2021 •
Collaborating on software projects is largely a solved problem, with a variety of hosted or self-managed platforms to choose from. For data science projects, collaboration is still an open question. There are a number of projects that aim to bring collaboration to data science, but they are all solving a different aspect of the problem. Dean Pleban and Guy Smoilovsky created DagsHub to give individuals and teams a place to store and version their code, data, and models. In this episode they explain how DagsHub is designed to make it easier to create and track machine learning experiments, and serve as a way to promote collaboration on open source data science projects.
9 February 2021 •
Creating well designed software is largely a problem of context and understanding. The majority of programming environments rely on documentation, tests, and code being logically separated despite being contextually linked. In order to weave all of these concerns together there have been many efforts to create a literate programming environment. In this episode Jeremy Howard of fast.ai fame and Hamel Husain of GitHub share the work they have done on nbdev. They explain how it allows you to weave together documentation, code, and tests in the same context so that it is more natural to explore and build understanding when working on a project. It is built on top of the Jupyter environment, allowing you to take advantage of the other great elements of that ecosystem, and it provides a number of excellent out of the box features to reduce the friction in adopting good project hygiene, including continuous integration and well designed documentation sites. Regardless of whether you have been programming for 5 days, 5 years, or 5 decades you should take a look at nbdev to experience a different way of looking at your code.
2 February 2021 •
Working with network protocols is a common need for software projects, particularly in the current age of the internet. As a result, there are a multitude of libraries that provide interfaces to the various protocols. The problem is that implementing a network protocol properly and handling all of the edge cases is hard, and most of the available libraries are bound to a particular I/O paradigm which prevents them from being widely reused. To address this shortcoming there has been a movement towards "sans I/O" implementations that provide the business logic for a given protocol while remaining agnostic to whether you are using async I/O, Twisted, threads, etc. In this episode Aymeric Augustin shares his experience of refactoring his popular websockets library to be I/O agnostic, including the challenges involved in how to design the interfaces, the benefits it provides in simplifying the tests, and the work needed to add back support for async I/O and other runtimes. This is a great conversation about what is involved in making an ideal a reality.
26 January 2021 •
One of the common complaints about Python is that it is slow. There are languages and runtimes that can execute code faster, but they are not as easy to be productive with, so many people are willing to make that tradeoff. There are some use cases, however, that truly need the benefit of faster execution. To address this problem Kevin Modzelewski helped to create the Pyston interpreter that is focused on speeding up unmodified Python code. In this episode he shares the history of the project, discusses his current efforts to optimize a fork of the CPython interpreter, and outlines his goals for building a business to support the ongoing work to make Python faster for everyone. This is an interesting look at the opportunities that exist in the Python ecosystem and the work being done to address some of them.
19 January 2021 •
Every software project has a certain amount of boilerplate to handle things like linting rules, test configuration, and packaging. Rather than recreate everything manually every time you start a new project you can use a utility to generate all of the necessary scaffolding from a template. This allows you to extract best practices and team standards into a reusable project that will save you time. The Copier project is one such utility that goes above and beyond the bare minimum by supporting project _evolution_, letting you bring in the changes to the source template after you already have a project that you have dedicated significant work on. In this episode Jairo Llopis explains how the Copier project works under the hood and the advanced capabilities that it provides, including managing the full lifecycle of a project, composing together multiple project templates, and how you can start using it for your own work today.
12 January 2021 •
On its surface Python is a simple language, which has contributed to its rise in popularity. As you move to intermediate and advanced usage you will find a number of interesting and elegant design elements that will let you build scalable and maintainable systems and design friendly interfaces. Luciano Ramalho is best known as the author of Fluent Python, which has quickly become a leading resource for Python developers to increase their facility with the language. In this episode he shares his journey with Python and his perspective on how the recent changes to the interpreter and ecosystem are influencing who is adopting it and how it is being used. Luciano has an interesting perspective on how the feedback loop between the community and the language is driving the current and future priorities of the features that are added.
5 January 2021 •
Building a web application requires integrating a number of separate concerns into a single experience. One of the common requirements is a content management system to allow product owners and marketers to make the changes needed for them to do their jobs. Rather than spending your developers' time and focus on building the end-to-end system, a growing trend is to use a headless CMS. In this episode Jake Lumetta shares why he decided to spend his time and energy on building a headless CMS as a service, when and why you might want to use one, and how to integrate it into your applications so that you can focus on the rest of your application.
28 December 2020 •
Notebooks have been a useful tool for analytics, exploratory programming, and shareable data science for years, and their popularity is continuing to grow. Despite their widespread use, there are still a number of challenges that inhibit collaboration and use by non-technical stakeholders. Barry McCardel and his team at Hex have built a platform to make collaboration on Jupyter notebooks a first class experience, as well as allowing notebooks to be parameterized and exposing the logic through interactive web applications. In this episode Barry shares his perspective on the state of the notebook ecosystem, why it is such a powerful tool for computing and analytics, and how he has built a successful business around improving the end to end experience of working with notebooks. This was a great conversation about an important piece of the toolkit for every analyst and data scientist.
21 December 2020 •
When working with data it's important to understand when it is correct. If there is a time dimension, then it can be difficult to know when variation is normal. Anomaly detection is a useful tool to address these challenges, but a difficult one to do well. In this episode Smit Shah and Sayan Chakraborty share the work they have done on Luminaire to make anomaly detection easier to work with. They explain the complexities inherent to working with time series data, the strategies that they have incorporated into Luminaire, and how they are using it in their data pipelines to identify errors early. If you are working with any kind of time series then it's worth giving Luminaire a look.
15 December 2020 •
Technologies for building data pipelines have been around for decades, with many mature options for a variety of workloads. However, most of those tools are focused on processing of text based data, both structured and unstructured. For projects that need to manage large numbers of binary and audio files the list of options is much shorter. In this episode Lynn Root shares the work that she and her team at Spotify have done on the Klio project to make that list a bit longer. She discusses the problems that are specific to working with binary data, how the Klio project is architected to allow for scalable and efficient processing of massive numbers of audio files, why it was released as open source, and how you can start using it today for your own projects. If you are struggling with ad-hoc infrastructure and a medley of tools that have been cobbled together for analyzing large or numerous binary assets then this is definitely a tool worth testing out.
7 December 2020 •
Building a complete web application requires expertise in a wide range of disciplines. As a result it is often the work of a whole team of engineers to get a new project from idea to production. Meredydd Luff and his co-founder built the Anvil platform to make it possible to build full stack applications entirely in Python. In this episode he explains why they released the application server as open source, how you can use it to run your own projects for free, and why developer tooling is the sweet spot for an open source business model. He also shares his vision for how the end-to-end experience of building for the web should look, and some of the innovative projects and companies that were made possible by the reduced friction that the Anvil platform provides. Give it a listen today to gain some perspective on what it could be like to build a web app.
1 December 2020 •
In a software project writing code is just one step of the overall lifecycle. There are many repetitive steps such as linting, running tests, and packaging that need to be run for each project that you maintain. In order to reduce the overhead of these repeat tasks, and to simplify the process of integrating code across multiple systems the use of monorepos has been growing in popularity. The Pants build tool is purpose built for addressing all of the drudgery and for working with monorepos of all sizes. In this episode core maintainers Eric Arellano and Stu Hood explain how the Pants project works, the benefits of automatic dependency inference, and how you can start using it in your own projects today. They also share useful tips for how to organize your projects, and how the plugin oriented architecture adds flexibility for you to customize Pants to your specific needs.
23 November 2020 •
Building a machine learning model is a process that requires well curated and cleaned data and a lot of experimentation. Doing it repeatably and at scale with a team requires a way to share your discoveries with your teammates. This has led to a new set of operational ML platforms. In this episode Michael Del Balso shares the lessons that he learned from building the platform at Uber for putting machine learning into production. He also explains how the feature store is becoming the core abstraction for data teams to collaborate on building machine learning models. If you are struggling to get your models into production, or scale your data science throughput, then this interview is worth a listen.
17 November 2020 •
The CPython implementation has grown and evolved significantly over the past ~25 years. In that time there have been many other projects to create compatible runtimes for your Python code. One of the challenges for these other projects is the lack of a fully documented specification of how and why everything works the way that it does. In the most recent Python language summit Mark Shannon proposed implementing a formal specification for CPython, and in this episode he shares his reasoning for why that would be helpful and what is involved in making it a reality.
10 November 2020 •
Artificial intelligence applications can provide dramatic benefits to a business, but only if you can bring them from idea to production. Henrik Landgren was behind the original efforts at Spotify to leverage data for new product features, and in his current role he works on an AI system to evaluate new businesses to invest in. In this episode he shares advice on how to identify opportunities for leveraging AI to improve your business, the capabilities necessary to enable a successful project, and some of the pitfalls to watch out for. If you are curious about how to get started with AI, or what to consider as you build a project, then this is definitely worth a listen.
3 November 2020 •
Python and Java are two of the most popular programming languages in the world, and have both been around for over 20 years. In that time there have been numerous attempts to provide interoperability between them, with varying methods and levels of success. One such project is JPype, which allows you to use Java classes in your Python code. In this episode the current lead developer, Karl Nelson, explains why he chose it as his preferred tool for combining these ecosystems, how he and his team are using it, and when and how you might want to use it for your own projects. He also discusses the work he has done to enable use of JPype on Android, and what is in store for the future of the project. If you have ever wanted to use a library or module from Java, but the rest of your project is already in Python, then this episode is definitely worth a listen.
26 October 2020 •
The release of Python 3.9 introduced a new parser that paves the way for brand new features. Every programming language has its own specific syntax for representing the logic that you are trying to express. The way that the rules of the language are defined and validated is with a grammar definition, which in turn is processed by a parser. The parser that the Python language has relied on for the past 25 years has begun to show its age through mounting technical debt and a lack of flexibility in defining new syntax. In this episode Pablo Galindo and Lysandros Nikolaou explain how, together with Python's creator Guido van Rossum, they replaced the original parser implementation with one that is more flexible and maintainable, why now was the time to make the change, and how it will influence the future evolution of the language.
19 October 2020 •
The way that applications are being built and delivered has changed dramatically in recent years with the growing trend toward cloud native software. As part of this movement toward the infrastructure and orchestration that powers your project being defined in software, a new approach to operations is gaining prominence. Commonly called GitOps, the main principle is that all of your automation code lives in version control and is executed automatically as changes are merged. In this episode Victor Farcic shares details on how that workflow brings together developers and operations engineers, the challenges that it poses, and how it influences the architecture of your software. This was an interesting look at an emerging pattern in the development and release cycle of modern applications.
12 October 2020 •
Learning to code is a never-ending journey, which is why it's important to find a way to stay motivated. A common refrain is to just find a project that you're interested in building and use that goal to keep you on track. The problem with that advice is that as a new programmer, you don't have the knowledge required to know which projects are reasonable, which are difficult, and which are effectively impossible. Steven Lott has been sharing his programming expertise as a consultant, author, and trainer for years. In this episode he shares his insights on how to keep readers, students, and colleagues interested enough to learn the fundamentals without losing sight of the long term gains. He also uses his own difficulties in learning to maintain, repair, and captain his sailboat as relatable examples of the learning process and how the lessons he has learned can be translated to the process of learning a new technology or skill. This was a great conversation about the various aspects of how to learn, how to stay motivated, and how to help newcomers bridge the gap between what they want to create and what is within their grasp.
6 October 2020 •
Python is a powerful and expressive programming language with a vast ecosystem of incredible applications. Unfortunately, it has always been challenging to share those applications with non-technical end users. Gregory Szorc set out to solve the problem of how to put your code on someone else's computer and have it run without having to rely on extra systems such as virtualenvs or Docker. In this episode he shares his work on PyOxidizer and how it allows you to build a self-contained Python runtime along with statically linked dependencies and the software that you want to run. He also digs into some of the edge cases in the Python language and its ecosystem that make this a challenging problem to solve, and some of the lessons that he has learned in the process. PyOxidizer is an exciting step forward in the evolution of packaging and distribution for the Python language and community.
29 September 2020 •
Servers and services that have any exposure to the public internet are under a constant barrage of attacks. Network security engineers are tasked with discovering and addressing any potential breaches to their systems, which is a never-ending task as attackers continually evolve their tactics. In order to gain better visibility into complex exploits Colin O'Brien built the Grapl platform, using graph database technology to more easily discover relationships between activities within and across servers. In this episode he shares his motivations for creating a new system to discover potential security breaches, how its design simplifies the work of identifying complex attacks without relying on brittle rules, and how you can start using it to monitor your own systems today.
22 September 2020 •
News media is an important source of information for understanding the context of the world. To make it easier to access and process the contents of news sites Lucas Ou-Yang built the Newspaper library, which aids in automatic retrieval of articles and prepares them for analysis. In this episode he shares how the project got started, how it is implemented, and how you can get started with it today. He also discusses how recent improvements in the utility and ease of use of deep learning libraries open new possibilities for future iterations of the project.
15 September 2020 •
Data applications are complex and continually evolving, often requiring collaboration across multiple teams. In order to keep everyone on the same page a high level abstraction is needed to facilitate a cross-cutting view of the data orchestration across integration, transformation, analytics, and machine learning. Dagster is an innovative new framework that leans on the power and flexibility of Python to provide an extensible interface to the complete lifecycle of data projects. In this episode Nick Schrock explains how he designed the Dagster project to allow for integration with the entire data ecosystem while providing an opinionated structure for connecting the different stages of computation. He also discusses how he is working to grow an open ecosystem around the Dagster project, and his thoughts on building a sustainable business on top of it without compromising the integrity of the community. This was a great conversation about playing the long game when building a business while providing a valuable utility to a complex problem domain.
7 September 2020 •