Python Weekly (Issue 666 September 5 2024)
Welcome to issue 666 of Python Weekly. Let's get straight to the links this week.
From Our Sponsor
A weekly newsletter featuring the best hand-curated news, articles, tutorials, talks, tools, and libraries for programmers.
Articles, Tutorials and Talks
This tutorial guides coders through the fundamentals of large language models (LLMs), explaining how they work and how to build them from scratch in PyTorch. It covers coding a small GPT-like model, its data pipeline, architecture, pretraining, and fine-tuning using open-source libraries.
The article describes an attempt to classify a massive dataset of 8.4 million PDFs from Common Crawl using various machine learning techniques. The author experiments with different approaches, including deep learning models and traditional machine learning methods like XGBoost, ultimately achieving the best performance with an XGBoost model trained on embeddings, reaching 85.26% accuracy after hyperparameter tuning.
The article argues for using Python virtual environments in Docker containers, citing benefits like predictability, standardization, and easier debugging. The author contends that virtual environments provide a consistent, well-understood structure for Python applications, making communication and deployment across teams more straightforward, while also simplifying Python's import behavior.
This video presents a surprising “Let it burn” approach to error handling, demonstrating how allowing code to fail fast can result in simpler, clearer, and more robust software. Discover the benefits of this method and its impact on improving overall code quality.
We just released chDB version 2.0, which lets you query Pandas DataFrames 87x faster than 1.0. In this blog post we'll explain how we did it.
In the first part of this series, we created a Django online shop with htmx. In this second part, we'll handle orders using Stripe.
The post discusses various approaches to testing HTTP requests in Python applications, focusing on mocking external API calls during unit and integration testing.
This video tutorial demonstrates how to build a full-stack ChatGPT-like UI using Reflex, a Python framework for web development, integrating it with Neon Postgres database and OpenAI. It covers the entire process from setting up the development environment to deploying the application using Docker, GitHub Actions, and Ansible on a virtual machine.
The article provides a simple solution for macOS users to escape Anaconda's control over their Python environment by moving the .zshrc file out of the home directory. It offers step-by-step instructions for non-technical users to toggle between official Python and Anaconda versions without using command-line interfaces or editing files.
An overview of the Django ORM, how it compares to raw SQL, and the gotchas you should be aware of when using it.
Transform scattered logs into actionable insights with seamless Google Cloud integration for FastAPI apps.
The article discusses using GPT-4 with OpenAI's structured outputs feature to create an AI-assisted web scraper, exploring its capabilities in parsing complex tables and generating XPaths. While the author found GPT-4 effective at extracting data from various HTML tables, they also noted challenges with merged rows, high API costs, and the need for further refinements to improve accuracy and efficiency.
A step-by-step guide to developing your own pre-commit hook.
The tutorial teaches how to analyze multimodal data using Large Language Models (LLMs) and Python, covering text classification, image-based question answering, audio transcription, and creating a natural language query interface for SQL databases.
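For readers curious about the HTTP-testing article listed above, here is a minimal sketch of one common approach: replacing the network call with a mock so the test never touches a real server. It is not taken from the linked post. The function `fetch_user_name` and the `api.example.com` endpoint are hypothetical and exist only for illustration, and the sketch uses only the standard library (`urllib.request` and `unittest.mock`).

```python
import json
from unittest import mock
from urllib import request


def fetch_user_name(user_id: int) -> str:
    """Call a hypothetical JSON API and return the user's name."""
    url = f"https://api.example.com/users/{user_id}"  # hypothetical endpoint
    with request.urlopen(url) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["name"]


def test_fetch_user_name() -> None:
    # Build a fake response object that also works as a context manager.
    fake_response = mock.MagicMock()
    fake_response.read.return_value = b'{"name": "Ada"}'
    fake_response.__enter__.return_value = fake_response

    # Patch urlopen so no real network request is made during the test.
    with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_urlopen:
        assert fetch_user_name(1) == "Ada"

    # Verify the code under test requested the URL we expected.
    fake_urlopen.assert_called_once_with("https://api.example.com/users/1")
```

Running `test_fetch_user_name()` directly, or collecting it with a test runner such as pytest, exercises the function without network access. Patching at the `urllib.request.urlopen` boundary keeps the test fast and deterministic, at the cost of not checking how a real server behaves.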
Interesting Projects, Tools and Libraries
Mini-Omni is an open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output.
Linux Screen Recorder, Broadcaster, Capture and OCR with AI in mind.
Lightweight function pipeline (DAG) creation in pure Python for scientific workflows.
supertree is a Python package designed to visualize decision trees in an interactive and user-friendly way within Jupyter Notebooks, Jupyter Lab, Google Colab, and any other notebooks that support HTML rendering.
Cut video files with minimal recoding.
An open-source RAG-based tool for chatting with your documents.
A fun party trick to run Python code from another venv into this one.
A comprehensive resource for learning Natural Language Processing (NLP) from the basics to advanced topics. It contains Jupyter notebooks covering various NLP concepts, techniques, and implementations, making it a valuable guide for beginners and intermediate learners in the field of NLP.
A modern cookiecutter template for Python projects that use uv for dependency management.
New Releases
Upcoming Events and Webinars
The following talks are scheduled:
Empowering Django with Background Workers
Pydantic Logfire — Uncomplicated Observability
The following talks are scheduled:
Job search automation with data scraping and machine learning
Using AutoGluon (AutoML) for Image Classification or Semantic Segmentation
The following talks are scheduled:
Not your typical RAG application
Building a Training Course Outline with Azure OpenAI
Enhancing Retrieval Augmented Generation with GraphRAG
The following talks are scheduled:
Embedding Software Engineering Best Practices into Machine Learning Projects with Kedro
Optimizing Rail Traffic Control using a Digital Twin and Reinforcement Learning
The following talks are scheduled:
Aligning Signals: Key Learnings in LLMOps for Faster, Confident Development
Reusable AI: Customizing LLMs for Diverse Business Needs
The following talks are scheduled:
Aequitas Flow: A Fair ML optimization framework
Llama to Llama 3.1 -- a year and a half of open-access LLMs in retrospective
Our Other Newsletters
- A free weekly newsletter for programmers.
- A free weekly newsletter for entrepreneurs featuring the best curated content, must-read articles, how-to guides, tips and tricks, resources, events, and more.