What’s new in RapidMiner 7.4?
2017-02-27
Background Execution
This feature is only available for users with a Large license.
Processes can now be executed in the background of Studio while you work on a different process in the user interface.
Background Monitor
The Background Monitor displays the current state of the background processes and gives access to the results.
Parallel Loops
RapidMiner now features new parallel Loop operators that run multiple iterations at once. They make full use of your available CPU cores (up to the limits of your RapidMiner license), which can greatly speed up your processes.
New parallelized Loop operator.
New parallelized Loop Values operator.
New parallelized Loop Attributes operator.
New parallelized Loop Files operator.
Please note that existing processes will continue to use the old operators for compatibility reasons! To use the new operators in such processes, you need to manually replace the existing operators with the new ones.
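The idea behind a parallel loop can be sketched in a few lines of Python. This is only an illustration of the concept, not RapidMiner's implementation: `parallel_loop`, `body`, and `core_limit` are hypothetical names, and `core_limit` merely stands in for a license-imposed cap on usable cores.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_loop(values, body, core_limit=None):
    """Run body(value) for every value, several iterations at a time.

    core_limit is a hypothetical stand-in for a license cap on cores.
    """
    cores = os.cpu_count() or 1
    workers = min(cores, core_limit) if core_limit else cores
    # Threads keep the sketch simple; CPU-bound pure-Python work would
    # usually use ProcessPoolExecutor to actually occupy several cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, even if iterations
        # finish out of order.
        return list(pool.map(body, values))


if __name__ == "__main__":
    print(parallel_loop([1, 2, 3, 4], lambda v: v * v, core_limit=2))
```

Results come back in the order of the input values, so the surrounding logic does not need to change when a sequential loop is swapped for a parallel one.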
Order your Repository
The Repository panel now lets you switch between alphanumeric and chronological order.
Granting Additional Permissions to Extensions
Users with Large licenses can now grant additional permissions to unsigned extensions. This is done through a toggle in the Start-up section of the Settings dialog. These permissions are not enabled by default, because they increase the security risk by allowing unknown software to run on your system.
Introducing SparkRM
RapidMiner Radoop 7.4 introduces SparkRM (available with the “Enterprise” license). With SparkRM any operator or process existing in RapidMiner Studio can be run in parallel in a Hadoop environment, leveraging Spark as the execution framework.
The user-defined subprocess (i.e. visually defined code) in the new SparkRM meta-operator can contain any in-memory RapidMiner operator, including those from extensions. The operator encapsulates that subprocess and pushes it to Hadoop, where it is automatically executed inside Spark, potentially on multiple Hadoop nodes. Beforehand, the input data provided to the SparkRM operator is partitioned (by the values of an attribute, linearly, or randomly) and distributed to the Hadoop nodes. The RapidMiner subprocess is then run on each of those partitions. After execution, the result is merged if it’s a coherent dataset, or returned as a collection otherwise.
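The partition, apply, and merge flow described above can be illustrated with a small single-machine Python sketch. It is not Radoop or Spark code: all function names are hypothetical, rows are modeled as plain dictionaries, and "coherent dataset" is approximated by "every partition returned a list of rows", which is an assumption made only for this example.

```python
import random
from collections import defaultdict


def partition_by_attribute(rows, attribute):
    """Group rows that share the same value of one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)
    return list(groups.values())


def partition_linear(rows, n):
    """Split rows into at most n contiguous chunks."""
    size = max(1, -(-len(rows) // n))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partition_random(rows, n, seed=0):
    """Shuffle a copy of the rows, then split it into chunks."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return partition_linear(shuffled, n)


def run_pushdown(partitions, body):
    """Apply body to every partition, then merge or collect the results."""
    results = [body(part) for part in partitions]
    # Assumption: list results are treated as tables that can be merged.
    if all(isinstance(r, list) for r in results):
        return [row for r in results for row in r]
    # Anything else is handed back as a collection, one item per partition.
    return results


if __name__ == "__main__":
    rows = [{"region": "a", "x": 1}, {"region": "b", "x": 2},
            {"region": "a", "x": 3}]
    parts = partition_by_attribute(rows, "region")
    print(run_pushdown(parts, lambda p: [{"x": r["x"] * 10} for r in p]))
    print(run_pushdown(parts, len))
```

In the sketch the partitions are processed one after another; in the real operator, according to the description above, each partition is handled inside Spark on the Hadoop nodes.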
SparkRM opens up a variety of new use cases that Radoop can now solve natively on Hadoop, especially those that need an extension, such as text analytics, process mining, time series analytics, forecasting, and many more. For a more detailed guide, see the SparkRM: Process Pushdown section in the documentation.