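While doing custom development on Kettle, I noticed that Kettle has a mechanism that automatically updates the R_STEP_TYPE table. Tracing through the code gives the following: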
    //
    // R_STEP_TYPE
    //
    // Create table...
    boolean ok_step_type = true;
    table = new RowMeta();
    tablename = KettleDatabaseRepository.TABLE_R_STEP_TYPE;
    schemaTable = databaseMeta.getQuotedSchemaTableCombination( null, tablename );
    ...... // middle part omitted; if the table exists, its data is updated
    if ( ok_step_type ) {
      updateStepTypes( statements, dryrun, create ); // perform the update
      if ( log.isDetailed() ) {
        log.logDetailed( "Populated table " + schemaTable );
      }
    }

Moving on to the updateStepTypes method:
    public List<String> updateStepTypes( List<String> statements, boolean dryrun, boolean create ) throws KettleException {
      synchronized ( repository ) {

        // We should only do an update if something has changed...
        //
        List<PluginInterface> plugins = pluginRegistry.getPlugins( StepPluginType.class );
        ObjectId[] ids = loadPluginsIds( plugins, create );

        for ( int i = 0, idsLength = ids.length; i < idsLength; i++ ) {
          ObjectId id = ids[ i ];
          if ( id == null ) {
            // Not found, we need to add this one...
            // This is the key step: if the repository lookup finds no ID for the plugin,
            // take the next step type ID for the table and insert a new row.
            if ( !create ) {
              id = repository.connectionDelegate.getNextStepTypeID();
            } else {
              id = new LongObjectId( i + 1 );
            }

            PluginInterface sp = plugins.get( i );

            RowMetaAndData table = new RowMetaAndData();
            table.addValue( new ValueMetaInteger( KettleDatabaseRepository.FIELD_STEP_TYPE_ID_STEP_TYPE ), id );
            table.addValue( new ValueMetaString( KettleDatabaseRepository.FIELD_STEP_TYPE_CODE ), sp.getIds()[0] );
            table.addValue( new ValueMetaString( KettleDatabaseRepository.FIELD_STEP_TYPE_DESCRIPTION ), sp.getName() );
            table.addValue( new ValueMetaString( KettleDatabaseRepository.FIELD_STEP_TYPE_HELPTEXT ), sp.getDescription() );

            if ( dryrun ) {
              String sql = database.getSQLOutput( null, KettleDatabaseRepository.TABLE_R_STEP_TYPE,
                table.getRowMeta(), table.getData(), null );
              statements.add( sql );
            } else {
              database.prepareInsert( table.getRowMeta(), null, KettleDatabaseRepository.TABLE_R_STEP_TYPE );
              database.setValuesInsert( table );
              database.insertRow();
              database.closeInsert();
            }
          }
        }
      }
      return statements;
    }

In other words, every time it connects to the repository, Kettle queries the database by each plugin's code to obtain the code-to-ID mapping. If a row is found, its ID is returned; if not, a new ID is allocated and a row is inserted. Keep this in mind for any operation that updates the data in the repository tables.
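The lookup-or-insert behavior described above can be sketched without a database. The following is a minimal in-memory simulation, not Kettle code: the class `StepTypeTableSketch`, its methods, and the plugin code `MyCustomStep` are hypothetical, and `nextId()` assumes that `getNextStepTypeID()` effectively yields the current maximum ID plus one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for R_STEP_TYPE (plugin code -> ID_STEP_TYPE). Not Kettle API. */
class StepTypeTableSketch {
    private final Map<String, Long> idByCode = new LinkedHashMap<>();

    /** Returns the stored ID for a plugin code, or null when the code has no row yet. */
    public Long lookupId(String code) {
        return idByCode.get(code);
    }

    /** Assumption: the next ID is the current maximum ID plus one. */
    public long nextId() {
        long max = 0;
        for (long id : idByCode.values()) {
            max = Math.max(max, id);
        }
        return max + 1;
    }

    /**
     * Adds a row for every plugin code that is missing from the table.
     * With create == true the ID is the list position plus one, as in the quoted method;
     * otherwise the next free ID is used. Returns the codes that were inserted.
     */
    public List<String> sync(List<String> pluginCodes, boolean create) {
        List<String> added = new ArrayList<>();
        for (int i = 0; i < pluginCodes.size(); i++) {
            String code = pluginCodes.get(i);
            if (lookupId(code) != null) {
                continue; // already mapped, keep the existing ID
            }
            long id = create ? i + 1 : nextId();
            idByCode.put(code, id);
            added.add(code);
        }
        return added;
    }

    public static void main(String[] args) {
        StepTypeTableSketch repo = new StepTypeTableSketch();

        // Fresh repository: every registered plugin gets a row.
        System.out.println(repo.sync(Arrays.asList("TableInput", "TextFileOutput"), true));
        // prints [TableInput, TextFileOutput]

        // Later connection with one extra (hypothetical) plugin: only that one is inserted.
        List<String> plugins = Arrays.asList("TableInput", "TextFileOutput", "MyCustomStep");
        System.out.println(repo.sync(plugins, false)); // prints [MyCustomStep]
        System.out.println(repo.lookupId("MyCustomStep")); // prints 3

        // Nothing changed since the last run, so nothing is inserted.
        System.out.println(repo.sync(plugins, false)); // prints []
    }
}
```

In this sketch the ID given to a new code depends on what the table already contains, so anything that edits the repository tables by hand is probably safer matching rows on the code column than on a hard-coded ID.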