RECAP
About the RECAP project
RECAP will develop a radically novel concept for the provision of cloud services, in which services are elastically instantiated and provisioned close to the users who actually need them, via self-configurable cloud computing systems.
Idea and base assumptions
In the past decade, data centres have seen historically unrivalled development in scale, automation, and energy efficiency for ICT resource provisioning. Yet even at this unprecedented pace of technological development, cloud systems are still pushing the boundaries of what data centres can deliver in terms of reliable ICT capacity and energy efficiency. The coming challenges from the Internet of Things (IoT) and the networked society place increasingly high demands on intelligent automation and adaptive resource provisioning as clouds grow beyond and between data centres.
The vision of the networked society details the interconnection of a myriad of devices, ranging from distributed hand-held devices to (semi-)autonomous vehicles and robots (operating in the manufacturing industry or on behalf of ordinary citizens), all connected into cloud systems. This vision brings new challenges for dependable infrastructure, automation, and security. While data centres are among today’s most advanced cyber-physical infrastructures, further research is needed to advance the state of the art in order to fully leverage their resource management automation and optimisation potential.
The RECAP project will go beyond the current state of the art and develop the next generation of cloud/edge/fog computing capacity provisioning and remediation via targeted research advances in cloud infrastructure optimisation, simulation and automation. Building on advanced machine learning, optimisation and simulation techniques, the RECAP project will advance the state of the art in:
- The modelling of complex cloud applications and infrastructures via, e.g., the development of fine-grained and accurate application deployment and behaviour models for dynamic distributed cloud applications
- Application and component-level quality of service models for cloud (application and infrastructure) systems
- Automation of the creation of application and workload models, e.g., collecting data in a short time frame to support multiple orchestration systems focused on heterogeneous data centre infrastructures, and creating and synchronising models for multiple clouds located in different geographic areas
- Data centre infrastructure optimisation systems including, e.g., improved scheduling systems, decentralized monitoring and load balancing systems, and system management and control tools
- Simulation of large-scale scenarios and cloud/edge/fog computing systems including, e.g., faster simulations that can support orchestration decisions in a timely manner, and detailed simulation models that yield accurate measurements of storage, file systems, and networks
- Remediation of complex distributed systems and networks, e.g., automating the detection and correction of failures at the network and infrastructure levels while maintaining QoS
Objectives
Publication of annotated workload traces and artificial workload generators
To support development of and experimentation with new applications and resource management systems for cloud, edge, and fog computing systems, RECAP provides annotated fine-grained workload traces that correlate application workloads with resource load levels, capturing the relationship between applications and resources, as well as artificial workload generator models that can be used to dynamically derive new parameterised workloads that retain key statistical properties of the base workload traces.
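As a minimal sketch of how such a generator could work (the function names, parameters, and modelling approach below are illustrative assumptions, not RECAP’s actual generator), one can extract the diurnal profile and dispersion of a base trace and replay them at a chosen intensity:

```python
import numpy as np

def fit_profile(base_trace, period=288):
    """Estimate a diurnal profile from a base trace sampled, e.g., every
    5 minutes (288 samples/day), plus the spread of the residuals."""
    days = base_trace[: len(base_trace) // period * period].reshape(-1, period)
    profile = days.mean(axis=0)            # mean load per time-of-day slot
    residual_std = (days - profile).std()  # dispersion around the profile
    return profile, residual_std

def generate(profile, residual_std, days=7, scale=1.0, seed=0):
    """Derive a new parameterised workload that keeps the base trace's
    shape and variability, replayed at a chosen intensity (scale)."""
    rng = np.random.default_rng(seed)
    base = np.tile(profile, days) * scale
    noise = rng.normal(0.0, residual_std * scale, size=base.size)
    return np.clip(base + noise, 0.0, None)  # load levels cannot be negative

# Usage (assuming a base trace is available as a NumPy array):
# profile, std = fit_profile(np.load("base_trace.npy"))
# synthetic = generate(profile, std, days=7, scale=2.0)  # twice the intensity
```

The key property, mirrored in the sketch, is that the derived workload is parameterised (here by scale) while retaining the statistical shape of the original trace.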
Near real-time simulation-based decision support for heterogeneous distributed clouds
The complexity and heterogeneity of distributed edge, fog, and cloud computing environments surpass the capabilities of human operators. To abstract the complexities of these environments and provide decision and control support for application and resource administration, RECAP will provide near real-time simulation support for applications and application subsystems, infrastructure resources, and resource management systems, as well as for experimentation with models and validation of project results.
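As a toy illustration of the kind of what-if question such simulation-based decision support answers (all workload figures and node names below are invented for the example), a discrete-event model can estimate the response time an application would see on alternative resources before any real deployment:

```python
def simulate(arrival_times, service_time):
    """Toy single-server FIFO queue: returns the mean response time
    (waiting + service) for the given arrival times."""
    server_free_at, total_response = 0.0, 0.0
    for t in arrival_times:
        start = max(t, server_free_at)   # wait if the server is busy
        server_free_at = start + service_time
        total_response += server_free_at - t
    return total_response / len(arrival_times)

arrivals = [i * 0.5 for i in range(200)]  # one request every 0.5 s
for node, svc_time in [("small-node", 0.55), ("large-node", 0.20)]:
    print(node, round(simulate(arrivals, svc_time), 3), "s mean response")
```

Running the sketch shows the queueing delay that builds up on the undersized node, exactly the kind of prediction an orchestrator could consult before placing a workload.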
Resource- and energy-efficient provisioning of infrastructure resource capacity
When provisioning resources to applications, it is important both to accurately capture application capacity requirements and to identify the most suitable resources to allocate. RECAP provides infrastructure optimization models and mechanisms that determine what capacity is needed, when, and where, as well as tools to enact these decisions.
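For illustration only, the greedy sketch below captures the flavour of such a decision: satisfy requested capacity from the most energy-efficient sites that still meet a latency bound. The site inventory, figures, and function names are invented assumptions, not RECAP mechanisms.

```python
# Invented example inventory: free capacity, energy per core, access latency.
SITES = [
    {"name": "edge-1",   "free_cores": 16,   "watts_per_core": 9.0, "latency_ms": 5},
    {"name": "metro-dc", "free_cores": 256,  "watts_per_core": 6.5, "latency_ms": 20},
    {"name": "core-dc",  "free_cores": 4096, "watts_per_core": 4.0, "latency_ms": 60},
]

def place(demand_cores, max_latency_ms):
    """Greedily satisfy demand from the cheapest (energy per core)
    feasible sites; returns a list of (site name, cores) allocations."""
    feasible = [s for s in SITES if s["latency_ms"] <= max_latency_ms]
    plan, remaining = [], demand_cores
    for site in sorted(feasible, key=lambda s: s["watts_per_core"]):
        take = min(remaining, site["free_cores"])
        if take > 0:
            plan.append((site["name"], take))
            remaining -= take
        if remaining == 0:
            return plan
    raise RuntimeError("insufficient capacity within the latency bound")

print(place(300, max_latency_ms=60))  # batch job: cheapest cores win
print(place(10, max_latency_ms=10))   # latency-critical job: forced to the edge
```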
QoS-aware orchestration and remediation of critical applications and services
To achieve robust orchestration and remediation of critical applications, i.e. applications with strict QoS requirements, in distributed, highly heterogeneous resource environments, RECAP provides a modelling framework coupled with innovative self-adaptivity mechanisms that help applications intelligently take continuous corrective actions to compensate for volatility and heterogeneity in their environments.
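A minimal sketch of such a self-adaptivity loop is shown below; the monitoring probe, actuator, and QoS threshold are placeholders invented for the example, not RECAP components.

```python
import random
import time

QOS_TARGET_MS = 100.0  # assumed response-time objective for the application

def observe_latency_ms():
    """Placeholder monitoring probe; a real one would query telemetry."""
    return random.uniform(40.0, 160.0)

def scale(delta):
    """Placeholder actuator; a real one would add/remove service replicas."""
    print(f"scaling by {delta:+d} replica(s)")

def remediation_loop(iterations=5, poll_s=1):
    """Continuously compare observed QoS to the target and take
    corrective action to compensate for load volatility."""
    for _ in range(iterations):
        latency = observe_latency_ms()
        if latency > 1.2 * QOS_TARGET_MS:    # violation: scale out
            scale(+1)
        elif latency < 0.5 * QOS_TARGET_MS:  # large headroom: scale in
            scale(-1)
        time.sleep(poll_s)

remediation_loop()
```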
Partners
Ulm University
UULM’s role in the project is three-fold: First, it functions as project coordinator, covering all aspects of project administration and communication between the project partners.
Second, UULM continues to operate its infrastructure and offers it as a testbed for all partners. The close relationship with BelWü (cf. letter of intent) allows UULM not only to offer an intra-data-centre setting, but also to credibly emulate WAN-based access and thereby imitate fog computing scenarios. Moreover, UULM contributes to the definition of application models and plans to extend its Cloudiator toolset to support the application and load models developed in the project.
Finally, UULM will provide a comprehensive monitoring solution for large infrastructures, researched and developed in previous EC-funded projects, and extend it to match the requirements of the RECAP project.
Umeå University
The UmU distributed systems research group has extensive experience in cloud and data center resource management and has participated in and led the technical work of several European and national research projects. Their expertise in resource management, analytics, and optimization techniques is well mapped to the core research agenda of the project and will, in combination with broad cloud domain knowledge, be very valuable to both the research and the exploitation efforts of the project.
The UmU research team will lead or participate in most of the project’s application and infrastructure modelling efforts, as well as contribute to the application placement optimization work. UmU will contribute scientific expertise in distributed systems, architectural modelling, statistical and machine learning, as well as optimization techniques.
Dublin City University
The DCU research team will lead the simulation-related work and coordinate the dissemination and exploitation efforts. The team will also play a role in end-user requirements collection and specification.
DCU will provide research and expertise in the area of discrete event simulation modelling, development, and experimentation, correlating and validating the outputs between the simulated and real environments. A simulation toolkit will be developed, building on existing research and technical development from DCU and other partners, for the modelling of and experimentation with scenarios utilising the knowledge refined and created as part of the project. This will result in scientific and technical output by providing better analysis capabilities for understanding the readiness of a system or data centre for IoT, by enabling the prediction and analysis of different optimisations without the use of real resources, and by delivering mechanisms for understanding which data analysis techniques should be applied in real situations, leading to real economic benefits.
IMDEA Networks
IMDEA Networks will lead WP5 (Data collection, visualisation and analysis) and will be in charge of tasks T5.2 and T6.4. The expertise of IMDEA Networks will mainly be employed for application characterization, workload modelling and decomposition, QoS modelling, data gathering, and efficient data analytics. For the latter in particular, IMDEA will consider big data analysis and machine learning techniques successfully employed in previous projects.
In addition, IMDEA will draw on its experience with the implementation of software and protocols on real devices to help with the reproduction of artificial workload data for simulation purposes. Finally, IMDEA will be involved in the validation of the use cases against selected performance metrics.
Tieto
Tieto intends to evaluate 5G technologies and Machine Type Communication (MTC) aspects to support different industry and enterprise verticals. Based on this, their plans and targets fit very well with the goals of this project. Within the scope of the project, Tieto will evolve its solutions for system observability to optimize the provisioned end-to-end services, individual network functions, and infrastructure utilization.
Furthermore, they will refine the ability to simulate network characteristics for emerging use cases with new requirements on QoS, especially when it comes to increased reliability and reduced latency. Tieto will also evolve their test bed to simulate end-to-end Software Defined Network (SDN) capabilities and adapt it to an edge computing platform to support simulation of selective service distribution to the edge and to central clouds based on service requirements.
Linknovate
The Linknovate team has extensive experience in the operation of distributed systems serving real-world users. In particular, Linknovate has broad knowledge of the FiWare stack for distributed computing. Linknovate engineers have addressed the issue of continuously adapting their infrastructure and deployment nodes to answer varying loads of user requests. Linknovate’s experience and usage data will help in the testing and evaluation of the methods and models originating from this project in production scenarios.
Linknovate will contribute to this project in a dual role. First, it will act as an active source of data (user interaction, computational load, network traffic, etc.) to inform the design and adjustment of the project’s methods and models. Second, it will help demonstrate and test the developed models in the context of an actual running complex search engine with real clients and requirements.
Intel
Intel’s CSL team are leveraging their expertise and experience in Software Defined Infrastructures (SDI) and cloud computing resource orchestration and management to contribute to the research and technical tasks of RECAP. Specifically, Intel’s involvement, research and contributions will revolve around the topics of infrastructure modelling and capacity planning optimization from the perspective of the cloud/fog service provider. The planning of capacity through the definition and use of utility functions will be one particular area of research.
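As a hedged illustration of capacity planning with a utility function (the function shape, cost figure, and search below are invented for this sketch, not Intel’s models), a planner can pick the capacity level that maximises utility minus cost:

```python
import math

def utility(cores, demand=100.0):
    """Diminishing-returns utility of allocated capacity: value grows
    quickly up to the expected demand, then flattens out."""
    return 1.0 - math.exp(-cores / demand)

COST_PER_CORE = 0.002  # invented, normalised provisioning cost

# Pick the capacity level with the best utility-minus-cost trade-off.
best = max(range(0, 501, 10), key=lambda c: utility(c) - COST_PER_CORE * c)
print(f"plan {best} cores (net value {utility(best) - COST_PER_CORE * best:.3f})")
```

With these invented numbers the search settles around 160 cores, the point where the marginal utility of additional capacity drops below its marginal cost.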
Intel also foresee contributing to new simulation and emulation techniques for highly scalable infrastructure cloud/fog environments, and advancements in workload modelling and characterization.
Intel will also support the integration of any Intel open source telemetry and analytics frameworks that may be relevant, including potentially SNAP and TAP.
SATEC
The RECAP project will be a lever to consolidate SATEC’s skills in fog computing, Big Data (monitoring and massive data processing), and particularly in software deployment reliability and flexibility. The RECAP scenarios require an advanced, full-featured platform for developing and deploying complex big data, IoT, and cloud systems. One of SATEC’s goals will be to apply RECAP results in a wide range of fields and applications.
SATEC’s principal activity in the RECAP project focuses on use case development and the development of pilots/demonstrators (definition, design, implementation, execution, and validation). In WP3 and WP8, SATEC will therefore contribute to design, integration, internal validation, and the deployment and execution of pilots. In WP4, WP5, WP6, and WP7, SATEC can contribute by providing the real problems and needs of industry to be solved by academia, and will work to turn research results into near-market solutions. In WP2 (Dissemination and Exploitation), SATEC will contribute by moving the use cases closer to market.
BT
BT is an active participant in European collaborative research and development and has participated in all the past EU Framework Programmes, primarily in the ICT area. Within the 7th Framework Programme, BT co-ordinated four projects, provided technical leadership on two further projects, and partnered in twenty-four other projects. Several of these projects remain active.
BT’s role in the project is to provide use cases, modelling data, application characterization, and practical validation of results.
CERTH-ITI
CERTH’s main activity in the RECAP project focuses on the development and validation of the Large Scale Simulation Framework (WP7). This involves the design, implementation, integration, and validation of the simulation frameworks with the optimization engine for the envisioned use cases. In WP2 (Dissemination and Exploitation), CERTH will contribute mainly through research efforts aimed at increasing the innovation capacity of the software engineering and simulation disciplines.