SPRING for APACHE HADOOP CHANGELOG
==================================
http://www.springsource.org/spring-data/hadoop

Commit changelog: http://github.com/SpringSource/spring-hadoop/tree/v[version]
Issues changelog: http://jira.springsource.org/secure/ReleaseNote.jspa?projectId=10801


Changes in version 1.0 RC1 (2012-10-07)
---------------------------------------

General
* Introduced Hive and Pig runners for declarative script execution
* Refactored all (Cascading, M/R, Hive, Pig) runners as Callables instead of FactoryBeans
* Renamed 'pig' to 'pig-factory' and 'pig-ref' to 'pig-factory-ref'
* Renamed 'hive-client' to 'hive-client-factory' and 'hive-client-ref' to 'hive-client-factory-ref'
* Introduced pre and post actions to all (Cascading, M/R, Hive, Pig) runners
* Introduced embedded execution of Hadoop jars
* Improved spring-hadoop.xsd namespace
* Improved, refined and extended reference documentation
* Improved artifacts pom
* Upgraded to Spring Batch 2.1.9
* Upgraded to Hive 0.9.0
* Upgraded to Pig 0.10.0
* Upgraded to Gradle 1.2

Package o.s.data.hadoop.cascading
* Introduced FlowFactoryBean

Package o.s.data.hadoop.configuration
* Fixed potential cycle with FileSystem url registration

Package o.s.data.hadoop.fs
* Added codecs support to hdfs resources
* Refined DistributedCache fragment creation for CDH4/Hadoop 0.23 distros
* Introduced options for closing the FileSystem
* Fine-tuned the DistributedCache API for setting cache entries

Package o.s.data.hadoop.hbase
* Refined resource management of HBase tables

Package o.s.data.hadoop.hive
* Addressed swallowed exception occurring during script execution
* Improved HiveQL parsing for multi-line statements
* Introduced variable binding and substitution per Hive script
* Refined namespace to preserve parameter ordering
* Introduced HiveClient factory (to deal with thread-safety issues)
* Introduced HiveTemplate & callback
* Introduced extended exception conversion to DataAccessException
* Introduced HiveRunner

Package o.s.data.hadoop.mapreduce
* Introduced scope attribute for job definitions
* Introduced verbose flag to job tasklet
* Introduced more options for job and streaming namespace
* Introduced jar executor
* Refined Tool and Jar execution to prevent class loading leaks
* Refactored JobRunner FactoryBean into a Callable
* Introduced namespace for job-runner
* Removed path validation from JobFactoryBean

Package o.s.data.hadoop.pig
* Refined namespace to preserve parameter ordering
* Introduced PigServer factory (to deal with thread-safety issues)
* Introduced PigTemplate & callback
* Introduced extended exception conversion to DataAccessException
* Refined execution of Pig scripts
* Introduced PigRunner

Package o.s.data.hadoop.scripting
* Refactored HdfsScriptFactoryBean into HdfsScriptRunner
* Script definitions no longer cause execution on container lookup


Changes in version 1.0 M2 (2012-06-13)
--------------------------------------

General
* Introduced support for Hadoop Security
* Enhanced namespace declaration
* Introduced DAO support (Template & Callback) for HBase
* Added pig and hbase samples
* Upgraded to Hadoop 1.0.3

Package org.springframework.data.hadoop.cascading
* Added local Taps for Spring Framework Resource
* Added local Taps for Spring Integration MessageHandler and MessageSource

Package org.springframework.data.hadoop.configuration
* Refined creation of Configuration objects to be 'old' API aware

Package org.springframework.data.hadoop.fs
* Added support for hftp, hsftp and webhdfs

Package org.springframework.data.hadoop.mapreduce
* Added jar-based class-loading support
* Introduced support for Hadoop generic options
* Improved diagnostics in JobRunner and JobTasklet
* Introduced identity mapper/reducer as defaults
* Improved propagation of the thread context classloader (tccl) for Tool execution

Package org.springframework.data.hadoop.scripting
* Extended HdfsScriptFactory to allow direct wiring of Hadoop Configuration


Changes in version 1.0 M1 (2012-02-23)
--------------------------------------

General
* Aligned build system with Spring Framework 3.2
* Improved namespace
* Introduced support for Hadoop Tool
* Introduced support for Cascading

Package org.springframework.data.hadoop.batch
* Introduced Spring Batch ItemReader for HDFS

Package org.springframework.data.hadoop.configuration
* Introduced more utility methods

Package org.springframework.data.hadoop.mapreduce
* Introduced support for Hadoop Tool
* Fixed incorrect return value/type for JobRunner


Changes in version 0.9 RELEASE (2012-02-06)
-------------------------------------------

Spring XML namespace with support for creating and/or configuring
- Hadoop Configuration object
- MapReduce and Streaming Jobs
- HBase configuration
- Hive server and Thrift client
- Pig server instances that register and execute scripts either locally or remotely
- Hadoop DistributedCache

Spring XML namespace for executing scripts authored in JSR-223 compatible scripting languages

Support for executing HDFS operations in Groovy, JRuby, Jython or Rhino based on Hadoop Configuration

Embedded shell API for HDFS

Spring Batch Integration - tasklets for
- Map Reduce and Streaming jobs
- Hive
- Pig
- Script execution

Sample applications

Reference documentation