Abbass Marouni

Engineering Leader

All Posts

2024

FinSummarizer: AI-Powered Financial News Summarization and Market Impact Analysis

Navigating the fast-paced world of global financial markets demands more than just keeping up—it requires insight. The sheer volume of information from fina...

2023

Mini-Dolly: Unlocking the Potential of a Smaller LLM Model

Today, we’re diving into the fascinating world of Mini-Dolly. We all know how incredible LLMs are, but their massive size can be a bit of a downer – re...

2022

Poor Man's Technology Hype Cycles

I recently stumbled upon Gartner’s Hype Cycle Builder, which, simply put, lets you build custom Gartner hype cycle graphs for a given technology innovat...

2019

Nantes Scala Meetup Talk: Exploring BeamIOs with Scio

I had the pleasure of giving a talk at the first Nantes Scala Meetup to be hosted in the new Talend offices in Nantes. I’ve been playing with Scio for some...

What changed in the Big data landscape from 2013 to 2019

I’ve been a loyal follower of the Data Eng Weekly newsletter (formerly Hadoop Weekly) for the past 6 years. The newsletter is a great source for everything relat...

2018

Talk: Apache Beam Summit Europe 2018

I had the pleasure of giving a talk at the first European Apache Beam Summit, held in London. I presented Talend Pipeline Designer (previously Talend Data Str...

2017

Mining crypto currencies with Apache Spark

Crypto currencies are gaining more and more momentum, and the last Bitcoin surge (17/12/2017) where the exchange price almost hit the 20...

2014

Contributing to OpenStack Sahara Project

We’ve been using OpenStack for a while and so recently we decided to contribute back some of our work to the OpenStack Sahara project. OpenStack Sahara aims ...

Processing PCAP files with Hadoop

Processing network capture files is one of several use cases for large-scale processing in Hadoop using MapReduce. Network capture files record network ac...

Extending Apache Pig to filter your Big Data

Apache Pig provides a simple-to-use abstraction layer on top of Hadoop MapReduce. Pig allows us to define complex data flows using its scripting language Pig...

Demystifying Apache Hive indexes

Apache Hive provides a data warehousing layer on top of Hadoop. Hive uses Hadoop’s MapReduce as its query engine to execute complex SQL-like queries over dat...

HDFS heterogeneous storage

Hadoop 2.4.0 was released last week and with it came the first part of the HDFS heterogeneous storage support. The idea behind HDFS heterogeneous storage is ...