At LinkedIn, Apache Kafka is used heavily to store all kinds of data, such as member activity, logs, and metrics, and to carry a multitude of inter-service messages. LinkedIn maintains multiple data centers with multiple Kafka cluster...
Learn more about how to migrate your Kafka cluster from one Zookeeper cluster to another without any user impact.
By Benson Ma, Alok Ahuja
Co-authors: Zihan Li, Sudarshan Vasudevan, Lei Sun, and Shirshanka Das. Data analytics and AI power many business-critical use cases at...
Co-authors: Xiang Zhang and Jingyu Zhu. Introduction: The Lambda architecture has become a popular architectural style that promises...
Co-authors: Khai Tran and Steve Weiss. Batch and streaming computations are often combined in the Lambda architecture, but carry the cost of mainta...
By Ben Sykes
As the year draws to a close, we’re taking a look back at ten of our most popular 2019 articles on the LinkedIn Engineering Blog....
Co-authors: Krishnan Raman and Joey Salacup. Editor's note: This blog has been updated. Monitoring big data pipelines often equates to...
Co-authors: Jon Lee and Wesley Wu. Apache Kafka is a core part of our infrastructure at LinkedIn. It was originally developed in-house as a stream processin...
The LinkedIn feed relies on a ranked list of the most relevant content for a member. More than 80% of the feed is organic content created by people, compan...
Editor's note: This blog has been updated. Brooklin—a distributed service for streaming data in near real-time and at scale—has been...
At LinkedIn, Kafka is the de facto messaging platform that powers diverse sets of geographically distributed applications at scale. Examples include our di...
We are pleased to announce today the release of Samza 1.0, a significant milestone in the history of the project. Apache Samza is a...
How we scaled Spark streaming with a novel balanced Kafka reader for ingesting massive amounts of logging events from Kafka in near…
Co-authors: Vivek Nelamangala and PJ Xiao. Introduction to Notifications: Social media are computer-mediated platforms that facilitate creation and sharing o...
Two and a half years ago, the Data Infrastructure SRE team at LinkedIn introduced Burrow, an advanced way to monitor Apache Kafka...
Co-authors: Max Wolffe and Akhilesh Gupta. Introduction: You can’t fix something if you don’t know there’s a problem. Measuring and...