Category Archives: Data Engineering

Implementing the Speed Layer of Lambda Architecture using Spark Structured Streaming

This post is part of a series on Lambda Architecture consisting of: Introduction to Lambda Architecture; Implementing Data Ingestion using Apache Kafka, Tweepy; Implementing Batch Layer using Kafka, S3, Redshift; Implementing Speed Layer using Spark Structured Streaming; Implementing Serving … Continue reading

Posted in Big Data, Data Engineering | 4 Comments

Implementing the Serving Layer of Lambda Architecture using Redshift

This post is part of a series on Lambda Architecture consisting of: Introduction to Lambda Architecture; Implementing Data Ingestion using Apache Kafka, Tweepy; Implementing Batch Layer using Kafka, S3, Redshift; Implementing Speed Layer using Spark Structured Streaming; Implementing Serving … Continue reading

Posted in Big Data, Data Engineering | 4 Comments

Implementing the Batch Layer of Lambda Architecture using S3, Redshift and Apache Kafka

This post is part of a series on Lambda Architecture consisting of: Introduction to Lambda Architecture; Implementing Data Ingestion using Apache Kafka, Tweepy; Implementing Batch Layer using Kafka, S3, Redshift; Implementing Speed Layer using Spark Structured Streaming; Implementing Serving … Continue reading

Posted in Big Data, Data Engineering | 4 Comments

Ingesting realtime tweets using Apache Kafka, Tweepy and Python

This post is part of a series on Lambda Architecture consisting of: Introduction to Lambda Architecture; Implementing Data Ingestion using Apache Kafka, Tweepy; Implementing Batch Layer using Kafka, S3, Redshift; Implementing Speed Layer using Spark Structured Streaming; Implementing Serving … Continue reading

Posted in Big Data, Data Engineering | 4 Comments

Introduction to Lambda Architecture

This post is part of a series on Lambda Architecture consisting of: Introduction to Lambda Architecture; Implementing Data Ingestion using Apache Kafka, Tweepy; Implementing Batch Layer using Kafka, S3, Redshift; Implementing Speed Layer using Spark Structured Streaming; Implementing Serving … Continue reading

Posted in Big Data, Data Engineering | 4 Comments

Window functions in PostgreSQL

1. Setting up PostgreSQL on Mac OS
Install PostgreSQL: brew install postgresql
Start PostgreSQL: brew services start postgresql
Log in to the postgres shell to create a user: /usr/local/bin/psql -d postgres
Create the user: CREATE USER user PASSWORD 'password';
If you use a … Continue reading

Posted in Data Engineering, SQL Server, T-SQL

T-SQL Window functions syntax

Window functions are an advanced and powerful feature of the T-SQL language. I will give a few tips on how to use them, with examples on the AdventureWorks2014 OLTP database. Here are some notes on how to use them: … Continue reading

Posted in Data Engineering, SQL Server, T-SQL