OrioleDB: Solving PostgreSQL wicked problems

Monday 15 April 2024

Latest Databases

OrioleDB is a table storage extension for Postgres, designed to be a drop-in replacement for Postgres' existing storage engine. In this article, we discuss how OrioleDB solves some of the wicked problems in PostgreSQL.

system-design
key-value-store
distributed-systems
ML/DL
Understanding Backpropagation Algorithm
14 April

In this article, I would like to go over the mathematical process of training and optimizing a simple 4-layer neural network. I believe this would help the reader understand how backpropagation works as well as realize its importance.

Linear Algebra explained in the context of deep learning
13 April

In this article, I take a top-down approach to explaining linear algebra for deep learning: first presenting the applications and uses, then drilling down into the underlying concepts.

How Transformers Work: A Detailed Exploration of Transformer Architecture
1 April

Explore the architecture of Transformers, the models that have revolutionized data handling through self-attention mechanisms.

A first intro to Complex RAG (Retrieval Augmented Generation)
9 March

In this article, we discuss various technical considerations when implementing RAG, exploring the concepts of chunking, query augmentation, hierarchies, multi-hop reasoning, and knowledge graphs. We also discuss unsolved problems & opportunities in the RAG infrastructure space, and introduce some infrastructure solutions for building RAG pipelines.

How does Vector Database work?
6 March

A vector database indexes and stores vector embeddings for fast retrieval and similarity search, with capabilities like CRUD operations, metadata filtering, horizontal scaling, and serverless deployment. This article explains how vector databases work and how they can be used for semantic search and similarity matching.

Introducing txtai, the all-in-one embeddings database
5 March

txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

Retrieval Augmented Generation (RAG)
3 March

Retrieval augmented generation (RAG) is an architecture that supplies the most relevant and contextually important proprietary, private, or dynamic data to your generative AI application's large language model (LLM) as it performs tasks, improving its accuracy and performance.

Fine-Tuning Large Language Models (LLMs)
3 March

We start by introducing key fine-tuning concepts and techniques, then finish with a concrete example of how to fine-tune a model locally using Python and Hugging Face’s software ecosystem.

Preparing Text Data for Transformers: Tokenization, Mapping and Padding
2 March

Transformers require a specific input format, which includes tokenization, mapping and padding. This article explains how to prepare text data for transformer models.

Introduction to Large Language Models and Transformer Architecture
2 March

Large language models are a type of neural network that can be trained to perform a variety of natural language processing tasks.

Web Development
Design a Payment System
4 March

Implementing a payment system is not a leisurely task. Reliability and correctness are critical, and successful businesses generate a lot of payment requests as they scale. Since even a small amount of downtime can mean a lot of lost revenue, a payment system must be highly available and fault-tolerant. In this article, we discuss how to design one.

Cloudflare ditched NGINX and open-sourced Pingora
3 March

Cloudflare has open-sourced Pingora, the Rust framework for building high-performance HTTP proxy services that it developed to replace NGINX in its own infrastructure.

Databases
Explaining CREATE INDEX CONCURRENTLY
4 March

This technical blog post explains how CREATE INDEX CONCURRENTLY (CIC) works and how it avoids locking the table against writes. The distinguishing feature of CIC is that it builds a new index on a table without blocking concurrent inserts, updates, or deletes.

Spotify