- IoT Communication Protocols for Efficient Device Integration
What Are IoT Communication Protocols? IoT communication protocols are the standards and rules that allow devices to communicate over networks. They define how data is transmitted, how devices establish connections, and how they securely exchange info…
- 22 hours ago 5 Mar 25, 7:00pm - - Harnessing Real-Time Insights With Streaming SQL on Kafka
In the era of real-time data, the ability to process and analyze streaming information has become critical for businesses. Apache Kafka, a powerful distributed event streaming platform, is often at the heart of these real-time pipelines. But working…
- 24 hours ago 5 Mar 25, 5:00pm - - AI Agents for Data Warehousing
The term "data warehousing" was first introduced in the 1980s, referring to the practice of storing data from various sources within an organization. The collected data is then utilized for reporting, decision-making, accurate analytics, better custo…
- 2 days ago 4 Mar 25, 2:00pm - - Materialized Views in Data Stream Processing With RisingWave
Incremental computation in data streaming means updating results as fresh data comes in, without redoing all calculations from the beginning. This method is essential for handling ever-changing information, like real-time sensor readings, social medi…
- 3 days ago 3 Mar 25, 6:00pm - - Modern Data Processing Libraries: Beyond Pandas
As discussed in my previous article about data architectures emphasizing emerging trends, data processing is one of the key components of a modern data architecture. This article discusses various alternatives to the Pandas library for better performan…
- 3 days ago 3 Mar 25, 4:00pm - - Doris Lakehouse Integration: A New Approach to Data Analysis
In the wave of big data, the data volume of enterprises is growing explosively, and the requirements for data processing and analysis are becoming increasingly complex. Traditional databases, data warehouses, and data lakes operate separately, result…
- 6 days ago 28 Feb 25, 8:15pm - - Exploring IoT's Top WebRTC Use Cases
Around the world, 127 new devices are connected to the Internet every second. That translates to 329 million new devices hooked up to the Internet of Things (IoT) every month. The IoT landscape is expanding by the day, and, consequently, novel ways o…
- 6 days ago 28 Feb 25, 7:00pm - - Modern ETL Architecture: dbt on Snowflake With Airflow
The modern discipline of data engineering considers ETL (extract, transform, load) one of the processes that must be done to manage and transform data effectively. This article explains how to create an ETL pipeline that can scale and uses dbt (Data…
- 7 days ago 27 Feb 25, 11:15pm - - Top Methods to Improve ETL Performance Using SSIS
Extract, transform, and load (ETL) is the backbone of many data warehouses. In the data warehouse world, data is managed through the ETL process, which consists of three steps: extract—pulling or acquiring data from sources, transform—converting…
- 7 days ago 27 Feb 25, 9:45pm - - Cloud-Driven Analytics Solution Strategy in Healthcare
This paper examines the revolutionary possibilities of combining Apache Spark for real-time streaming analytics with cloud-based technologies, particularly AWS and Databricks. Using identity and access management (IAM) and encryption techniques, util…
- 7 days ago 27 Feb 25, 1:00pm - - How to Scale Elasticsearch to Solve Your Scalability Issues
As modern applications evolve to serve growing needs for real-time data processing and retrieval, the demand for scalability grows, too. Elasticsearch is an open-source, distributed search and analytics engine that is very efficient at handling…
- 8 days ago 26 Feb 25, 8:30pm - - Spark Job Optimization
We are living in an age where data is of utmost importance, whether for analysis, reporting, or training LLMs. The amount of data we capture in any field is increasing exponentially, which requires a technology that can process large amo…
- 9 days ago 25 Feb 25, 7:00pm - - The Future of Data Lakehouses: Apache Iceberg Explained
We know that data management today is changing completely. For decades, businesses relied on data warehouses, which stored information in an appropriate manner. They are structured, governed, and quick to extract information from, although expensive…
- 9 days ago 25 Feb 25, 5:00pm - - The Hidden Cost of Dirty Data in AI Development
Artificial intelligence is a transformative force across industries, including healthcare, finance, and many other sectors. AI systems achieve their highest performance through data that has been properly prepare…
- 9 days ago 25 Feb 25, 4:00pm - - Deduplication of Videos Using Fingerprints, CLIP Embeddings
Video deduplication is a crucial process for managing large-scale video inventory, where duplicates consume storage, increase processing costs, and affect data quality negatively. This article explores a robust architecture for deduplication using…
- 13 days ago 21 Feb 25, 6:00pm - - Scaling Image Deduplication: Finding Needles in a Haystack
In the current AI generation, where organizations deal with a vast inventory of images, identifying duplicates can be a daunting task. Distributed deduplication at scale is essential for optimizing storage, reducing redundancy, and maintaining data i…
- 14 days ago 20 Feb 25, 5:00pm - - Data Pattern Automation With AI and Machine Learning
In the age of information that we live in today, every company faces challenges using data to its full potential. One technique for removing those hurdles is pattern recognition, a method by which automated processes are applied. With the amazing pro…
- 15 days ago 19 Feb 25, 6:00pm - - ETL Generation Using GenAI
Generating ETL data pipelines using generative AI (GenAI) involves leveraging the capabilities of large language models to automatically create the code and logic for extracting, transforming, and loading data from various sources, significantly redu…
- 20 days ago 14 Feb 25, 5:00pm - - Loading XML into MongoDB
There are many situations where you may need to export data from XML to MongoDB. Although the XML and JSON(B) formats used in MongoDB have much in common, they also have a number of differences that make them non-interchangeable.
- 22 days ago 12 Feb 25, 8:00pm - - The Right ETL Architecture for Multi-Source Data Integration
When building ETL (Extract, Transform, Load) pipelines for marketing analytics, customer insights, or similar data-driven use cases, there are two primary architectural approaches: dedicated pipelines per source and a common pipeline with integration,…
- 22 days ago 12 Feb 25, 5:00pm -