Please use this identifier to cite or link to this item: https://ruomoplus.lib.uom.gr/handle/8000/1138
Title: Pipeline-Based Linear Scheduling of Big Data Streams in the Cloud
Authors: Tantalaki, Nikoleta 
Souravlas, Stavros 
Roumeliotis, Manos 
Katsavounis, Stefanos 
Author Department Affiliations: Department of Applied Informatics 
Department of Applied Informatics 
Author School Affiliations: School of Information Sciences 
School of Information Sciences 
Subjects: FRASCATI__Engineering and technology__Electrical engineering, Electronic engineering, Information engineering
FRASCATI__Natural sciences__Computer and information sciences
Keywords: stream processing
scheduling
big data
pipelines
distributed systems
Issue Date: 24-Jun-2020
Publisher: IEEE
Journal: IEEE Access 
ISSN: 2169-3536
Volume: 8
Start page: 117182
End page: 117202
Abstract: 
Nowadays, there is an accelerating need to handle large amounts of continuously arriving data efficiently and in a timely manner. Streams of big data have led to the emergence of several Distributed Stream Processing Systems (DSPS) that assign processing tasks to the available resources (dynamically or not) and route streaming data between them. Efficient scheduling of processing tasks can reduce application latencies and eliminate network congestion. However, the built-in scheduling techniques of the available DSPSs are far from optimal. In this work, we extend our previous work, in which we proposed a linear scheme for the task allocation and scheduling problem. Our scheme takes advantage of pipelines to efficiently handle applications that require heavy (all-to-all) communication between tasks assigned to pairs of components. Here, we prove that our scheme is periodic, and we provide a communication refinement algorithm and a mechanism to handle many-to-one assignments efficiently. For concreteness, our work is illustrated using Apache Storm semantics. The performance evaluation shows that our algorithm achieves load balance and constrains the required buffer space. For throughput testing, we compared our work to the default Storm scheduler, as well as to R-Storm. Our scheme was found to outperform both strategies, achieving an average improvement of 25%-40% over Storm’s default scheduler under different scenarios, mainly as a result of reduced buffering (≈ 45% less memory). Compared to R-Storm, the results indicate an average improvement of 35%-45%.
URI: https://doi.org/10.1109/ACCESS.2020.3004612
https://ruomoplus.lib.uom.gr/handle/8000/1138
DOI: 10.1109/ACCESS.2020.3004612
Corresponding Item Departments: Department of Applied Informatics
Department of Applied Informatics
Appears in Collections: Articles

Files in This Item:
File: 20-Access.pdf
Description: Final version, open access journal
Size: 2.56 MB
Format: Adobe PDF

Scopus citations: 28 (checked on Feb 6, 2026)
Page view(s): 135 (checked on Feb 12, 2026)
Download(s): 51 (checked on Feb 12, 2026)
