Python Weekly (Issue 663, August 15, 2024)

Welcome to issue 663 of Python Weekly. Let's get straight to the links this week.

Articles, Tutorials and Talks

This video demonstrates that there's a place for both object-oriented and functional code. In Python, these two approaches can be combined effectively, allowing you to leverage the strengths of each for the best results.

The article demonstrates how to approximate sum types in Python using Pydantic's tagged unions feature, providing a way to represent complex data structures with type safety. It explains the concept of sum types, shows how to implement them using Pydantic models, and discusses the benefits and limitations of this approach in Python programming.
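As a rough illustration of the tagged-union idea described above (not the article's own code), here is a minimal sketch assuming Pydantic v2. The `Circle`, `Rect`, and `area` names and the `kind` tag field are made up for this example.

```python
import math
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


# Each variant carries a literal "kind" tag (a hypothetical field name).
class Circle(BaseModel):
    kind: Literal["circle"] = "circle"
    radius: float


class Rect(BaseModel):
    kind: Literal["rect"] = "rect"
    width: float
    height: float


# The discriminator tells Pydantic which variant to validate against.
Shape = Annotated[Union[Circle, Rect], Field(discriminator="kind")]
shape_adapter = TypeAdapter(Shape)


def area(shape: Union[Circle, Rect]) -> float:
    # Branch on the concrete variant, as you would with a sum type.
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    return shape.width * shape.height


parsed = shape_adapter.validate_python({"kind": "rect", "width": 2, "height": 3})
print(type(parsed).__name__, area(parsed))

# An unknown tag is rejected instead of being coerced into some variant.
try:
    shape_adapter.validate_python({"kind": "triangle"})
    rejected = False
except ValidationError:
    rejected = True
```

The tag field is what lets validation pick exactly one variant, which is the closest Python gets to a sum type here.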

How costly is it to call functions and builtins in your Python code? Does inlining help? How have the recent CPython releases improved performance in these areas?
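If you want to try the question on your own machine, a minimal sketch with the standard library's `timeit` could look like this. The `add`, `with_call`, `inlined`, and `measure` helpers are made up for this example, and the numbers will vary by CPython version and hardware.

```python
import timeit


def add(a, b):
    return a + b


def with_call(n=1000):
    # Same arithmetic as inlined(), but routed through a function call.
    total = 0
    for i in range(n):
        total = add(total, i)
    return total


def inlined(n=1000):
    # The addition is written directly in the loop body.
    total = 0
    for i in range(n):
        total = total + i
    return total


def measure(func, repeat=5, number=200):
    # Take the best of several runs to reduce noise; result is seconds per call.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


call_time = measure(with_call)
inline_time = measure(inlined)
print(f"with call: {call_time * 1e6:.1f} us per loop")
print(f"inlined:   {inline_time * 1e6:.1f} us per loop")
```

Running the same script under different CPython releases is a simple way to see how the gap has changed.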

FlexAttention is a new PyTorch API that allows implementing various attention variants using idiomatic PyTorch code, which is then lowered into optimized FlashAttention kernels through torch.compile. It provides a flexible score_mod function for modifying attention scores and a mask_mod function for leveraging sparsity, enabling researchers to easily experiment with different attention mechanisms while maintaining high performance.

This post provides a detailed guide on how to scrape infinite scroll websites using Scrapy and Playwright in Python. It covers the setup process, explains how to implement a custom downloader middleware to handle JavaScript rendering, and demonstrates how to extract data from dynamically loaded content, offering a practical solution for web scraping challenges posed by modern web applications.

The article argues that CSVs (Comma-Separated Values) are problematic due to various edge cases involving delimiters, quotes, and newlines, and proposes using Delimiter-Separated Values (DSV) with ASCII control characters as a more robust alternative. It demonstrates how DSVs can handle complex data without escaping or quoting issues, but acknowledges that the lack of widespread tool support for this format is a significant drawback.
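To make the idea concrete, here is a minimal sketch (not the article's implementation) that uses the ASCII unit separator (0x1F) between fields and the record separator (0x1E) after each record. The `dumps` and `loads` helpers are made up for this example, and terminating every record with the separator is an assumption of this sketch.

```python
UNIT_SEP = "\x1f"    # ASCII unit separator, placed between fields
RECORD_SEP = "\x1e"  # ASCII record separator, placed after each record


def dumps(rows):
    # Fields may contain commas, quotes, and newlines, but not the separators.
    for row in rows:
        for field in row:
            if UNIT_SEP in field or RECORD_SEP in field:
                raise ValueError("field contains a separator character")
    return "".join(UNIT_SEP.join(row) + RECORD_SEP for row in rows)


def loads(text):
    # Every record ends with RECORD_SEP, so the final split piece is dropped
    # (it is empty for well-formed input).
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)[:-1]]


rows = [
    ["name", "quote"],
    ["Ada, Countess", 'She said "hello"'],
    ["multi\nline", ""],
]
encoded = dumps(rows)
decoded = loads(encoded)
print(decoded == rows)
```

No quoting or escaping is needed for the round trip, though, as the article notes, most tools will not open such a file directly.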

How to be efficiently lazy at finding hidden gems in predictable places – Database Edition.

The video covers setting up and managing Django files, including static and user-uploaded files, using Cloudflare's R2 object storage. It emphasizes best practices for configuring environment variables, securing API keys, and managing static and media files in Django with advanced validation and customization options.

Interesting Projects, Tools and Libraries

Real-time face swap and one-click video deepfake with only a single image.

Bring AI models closer to your PostgreSQL data.

emval is a blazingly fast Python email validator written in Rust.

DeltaDB is a lightweight, fast, and scalable database built on polars and deltalake. 

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents.

LinkedIn_AIHawk is a tool that automates the job application process on LinkedIn. Using artificial intelligence, it lets users apply to multiple job offers in an automated and personalized way.

An autoagentic AGI that is self-evolving and modular.

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.

Upcoming Events and Webinars

There will be the following talks:

  • Creating Domain Specific Languages with Python

  • Code Detective: Uncovering Complexity with Python’s Importlib and AST

There will be the following talks:

  • Containerizing and running a Python-based GenAI App

  • Hosting a Streamlit App on AWS

Our Other Newsletters

- A free weekly newsletter for programmers.

- A free weekly newsletter for entrepreneurs featuring the best curated content, must-read articles, how-to guides, tips and tricks, resources, events and more.