Here is a list of development areas that I often come back to when thinking about how and where the ControlTier project can evolve. The list is in no particular order of importance. The items tend to be strategic improvements aimed either at improving the new-user experience or at facilitating scale-up. Some improvements are incremental and achievable in one development cycle, while others are more profound and grandiose and might result in major version changes. Also, these issues come from someone with an internals perspective of the ControlTier system who has been immersed in the ControlTier paradigm for a very long time (i.e., me). Newer users will have a fresher and perhaps more practical assessment.
Depending on interest these ideas may be discussed further as CTIPs or even implemented in future ControlTier versions (See the Sourceforge tracker for a list of logged feature requests). Feel free to leave your comments on the talk (discussion) page or voice your opinions on the Google group.
How might future development be planned out? Here's what I had in mind:
- Continue current fixes and minor enhancements on the 3-4-support branch
- Open up a "roadmap wish list" discussion about tactical and strategic needs by using this one as a point of departure
- Organize it and prioritize the roadmap objectives
- Publish the roadmap with phasing and scheduling
- Establish two development tracks:
- ControlTier Vandelay: Establish the 3-5-development branch where compatibility and migration support will be added. Later the 3-5 users can converge to 4-0
- ControlTier TNG: 4-0-development branch where major refactoring/architecture work will be done. Release 4-0 snapshots to the community for feedback
- Drive the work forward!
Rationalize conceptual terms
ControlTier is really a set of independent projects that have evolved over the years. Each of these projects talks about a common set of concepts but sometimes using different terms. As our thinking has developed and we faced new use cases and challenges, we sometimes revised some of their names because we thought they would correspond to better known idioms. Unfortunately, these inconsistencies create an unnecessary complexity to learning how to use ControlTier and make documentation difficult to write.
The terms below are a case in point as they all describe the same thing in different ControlTier application contexts (e.g., between Workbench, CTL):
- object: An instance of a Type
- resource: An instance of a Resource
- entity: An instance of a Managed-Entity (itself a Resource)
- instance: An instance of a Type
Rationalizing terms may also include renaming some of the core types and attributes. For example, Managed-Entity might be renamed Managed-Resource, and Updater might be given a name reflecting that it does a Build and Deploy. The "buildstamp" attribute and option could be dropped in favor of "version". Rationalizing terms in the software code will also make the software internally coherent and easier to maintain across projects.
An effort to create a mapping of terms, highlighting overlap and ambiguity would be a good first step. Later this mapping can be used to agree on a standard set of terms used throughout the software projects and documentation.
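As a rough illustration of what such a mapping could look like, here is a minimal Python sketch that records the four terms listed above with their definitions and reports which ones overlap. The dictionary layout and the `find_overlaps` helper are assumptions made for illustration only; nothing like this exists in ControlTier today.

```python
from collections import defaultdict

# Term definitions copied from the list above.
TERMS = {
    "object": "An instance of a Type",
    "resource": "An instance of a Resource",
    "entity": "An instance of a Managed-Entity",
    "instance": "An instance of a Type",
}


def find_overlaps(terms):
    """Group terms by definition and keep definitions shared by two or more terms."""
    by_definition = defaultdict(list)
    for term, definition in terms.items():
        by_definition[definition].append(term)
    return {
        definition: sorted(names)
        for definition, names in by_definition.items()
        if len(names) > 1
    }


overlaps = find_overlaps(TERMS)
for definition, names in overlaps.items():
    print(f"{', '.join(names)} all mean: {definition}")
```

Each overlapping group found this way is a candidate for collapsing into one agreed term.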
As of 3.4.9, there are three graphical interfaces provided by separate webapps: Jobcenter, Reportcenter and Workbench. There are also three apparent classes of users: process managers (those who execute commands and manage services within the environment), process developers (those who develop supporting resource models), and module developers (those who define new types, constraints and workflows). The ControlTier toolchain should be organized to reflect these kinds of users.
An operations console would provide a unified tool for process managers. This new centralized view of CTL operations would combine the functionality of Jobcenter and Reportcenter along with new features.
- Concentrated summary of activity and available control actions.
- List, preview and execute loaded process definitions.
- Filter, list and display reported activity.
- Configured notification rules that email digests and individual reported events
- Define, filter, list and view operational resources (Nodes, Services and Controllers)
- Reports on Node and Service availability events
Workbench is the current modeling tool supporting all classes of users and is really several tools in one.
- Firstly, it provides an editing and visualization environment for the resource model (some people call this the CMDB). See Graphical resource editor
- Secondly, it acts as an IDE to developing commands and modules (this part used to be called ModuleBuilder). See Graphical type editor
- Thirdly, its "Package Manager" provides an interface to packaged artifacts stored in the WebDAV, while the "Node Manager" gives an inventory view into the CTL nodes. See: File share
So on the one hand, Workbench is meant to provide an operational view of the world, while on the other, it is a developer's design tool. Because of this, new users are often overwhelmed. One idea is to partition the functionality better so the two roles aren't blended together. A more radical approach would be to break the app down into separate components (see #Operations console above).
Improve text-based definitions
ControlTier provides web-based graphical user interfaces to author system definitions like jobs, resources and commands but often experienced users prefer text-based approaches as they can be maintained in an SCM and easily shared between team members. There are a variety of formats and some have expressed desire for better approaches.
- Consolidate XML tag set
- There are several XML files used to author various ControlTier definitions (job.xml, project.xml, type.xml, pview.xml, Ant datatypes). Each of these files was developed at a different time, by different people and for a different purpose, so they do not strictly share the same tag names even though the tags may semantically represent the same things. This is in the same vein as #Rationalize conceptual terms.
- Automated model management
- Today most users maintain their model as static definitions in text files (maintained in an SCM). Sometimes it is necessary to make a change that cuts across many resources, such as dependencies between Deployment and Packages. ControlTier includes the "Change-Dependencies" command to automate this kind of change, but it is not generic, runs only on the server, and is a bit complex. Some have expressed a preference for a pure file-based transformation process (e.g., JAXB for the project.xml) while others prefer scripted approaches.
- Domain specific language
- Some find the XML-based formats cumbersome and verbose (e.g., "too pointy"). One idea would be to define a simple domain specific language used to express system definitions. The trick is balancing language scope and programming flexibility (not to mention debates over the "right" base language).
- Reconcile resource model representation
- Users define a resource model in an XML file called project.xml. It does not in fact represent the model of a project but rather just a portion of the resources defined within the project. Other definitions also constitute a model: project organization, description, type definition, revision info, etc.
Self contained process definition
Users following the so called "model-driven approach" must understand several aspects about ControlTier before they can become productive users. They must know the model semantics (the core types and how they interact via dispatch command), the standard lifecycles and corresponding workflow commands, how to model resources in project.xml, perhaps define their own workflows in type.xml, and appreciate its solution development methodology. This is quite a learning curve for new users. A preferred approach would be to have a single definition file that lets users define a process, starting simply with one step and easily composing up from there.
A self contained process definition is one where a new user with limited understanding of the ControlTier system can — using scripting skills and a cursory exposure to the definition language — automate multi-step procedures across hosts and resources in the network.
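To make the idea concrete, here is a small Python sketch of what a single-file process definition and its expansion into per-host steps might look like. The `PROCESS` structure, its field names, the host names and the `plan` helper are all hypothetical; they are not an existing ControlTier format, and the sketch only plans the work rather than executing anything.

```python
# Hypothetical single-file process definition: an ordered list of steps,
# each naming the hosts it targets and the script to run there.
PROCESS = {
    "name": "restart-web",
    "steps": [
        {"hosts": ["web1", "web2"], "script": "service httpd stop"},
        {"hosts": ["web1", "web2"], "script": "service httpd start"},
    ],
}


def plan(process):
    """Expand a definition into ordered (step number, host, script) tuples.

    Nothing is executed; this only shows what would be dispatched and in what order.
    """
    return [
        (number, host, step["script"])
        for number, step in enumerate(process["steps"], start=1)
        for host in step["hosts"]
    ]


for number, host, script in plan(PROCESS):
    print(f"step {number} on {host}: {script}")
```

A user could start with a single step and one host, then compose up by appending steps to the same file.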
Scale up resource model loading
Resource model data is often declared in a "project.xml" file. The current project.xml data loader, ProjectBuilder#load-resources, is fairly slow and can become a bottleneck when reloading a project from sources. Part of the reason for the slowness is that all the work is done client side, with the resulting chattiness with the server.
Loading resource model data should be a server-side task wherein the server is passed an XML file and it does all the necessary checking and model modification. Further, the project.xml format should include attributes to instruct the processor to modify/replace/augment the affected resources.
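The following Python sketch shows one possible reading of the modify/replace/augment instruction when applied to a single resource on the server. The exact semantics of each mode are my assumption (replace discards old attributes, modify overwrites matching ones, augment only fills in missing ones), and `apply_resource` is a hypothetical function, not part of any ControlTier API.

```python
def apply_resource(model, name, attrs, mode="replace"):
    """Apply incoming attributes to one resource in an in-memory model.

    Assumed semantics (not defined by ControlTier):
      replace - discard existing attributes and use only the incoming ones
      modify  - keep existing attributes, overwriting those also present in attrs
      augment - keep existing attributes, adding only those not yet present
    """
    existing = model.get(name, {})
    if mode == "replace":
        merged = dict(attrs)
    elif mode == "modify":
        merged = {**existing, **attrs}
    elif mode == "augment":
        merged = {**attrs, **existing}
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    model[name] = merged
    return merged


# Hypothetical resource and attribute names, for illustration only.
model = {"web1": {"port": "80", "owner": "ops"}}
apply_resource(model, "web1", {"port": "8080"}, mode="modify")
apply_resource(model, "web1", {"port": "9090", "basedir": "/opt/web"}, mode="augment")
print(model["web1"])
```

Doing this merge in one server-side pass is what would remove the per-resource round trips described above.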
Reconcile attributes and type properties
Currently, users define the equivalent of type-level properties using Attributes and Setting subtypes and attribute defaults. The definition process is already a bit tedious, but the real concern is that users of the type have to know how to assign their own attribute values. The Workbench UI doesn't support this use case very well either, leading many new users into a corner.
A simpler albeit more radical idea would be to allow users to define true type-level properties. That is the best strategic solution.
The first ControlTier software project began in 2004 using a toolchain and practices current to that time (e.g., CVS, maven1, Java 1.3, Struts 1.x). Upgrades to newer tools have gradually been made (e.g., SVN, Java 1.6, Grails/Groovy) but I think it is time to undertake a wholesale modernization. Workbench, CTL, modules, installers are still built with maven1 while Grails apps, ReportCenter and Jobcenter are built using Grails procedures. See related google thread. The ControlTier distribution lends itself to dependency driven builds and maven2 seems the obvious choice for this.
- Standard build tool
- A proper modern build configuration can foster adoption and contribution. Maven 2 seems the likely choice. Should start with CTL as a first candidate.
- Continuous integration
- Regular use of a CI server (e.g., Hudson) is also critical to code integrity and quality. Again, CTL makes a good first candidate.
- Consolidate sources
- Related to the build configuration is source code management. ControlTier sources are spread across four Sourceforge projects (controltier, ctl-dispatch, webad, moduleforge). It would simplify development if some or all of these projects were consolidated into one repository.
- Project hoster
- Finally, consolidating the sources into one repository presents an opportunity to use a new SCM system and project hosting provider (e.g., GitHub/Git).
- Also see CTIP - Project hosting provider change
Finally, adopting a common web framework (see #Workbench upgrades) simplifies architectural design. Grails is the obvious choice as it has been a good enabler helping yield Jobcenter and Reportcenter.
The ControlTier server functionality should be accessible over REST-style interfaces. Workbench uses Castor-based XML marshaling between client and server for model data management. Reportcenter uses an XML-based query API. Jobcenter completely lacks a network-accessible API. An adequate set of REST interfaces will facilitate external tool integration. This topic also fits with the "Reconcile CTL Server" discussion elsewhere in this document.
Here are a few ideas for public interfaces:
- Resource model
- CRUD API to resource model
- Command execution
- CRUD+Job control
- An alternative to the custom log4j port, an HTTP interface would not require a special port access.
- Automated resource staging to CTL nodes
- CTL assumes modules and objects are already present when executing a command. This becomes an administrative annoyance, as users must know to distribute modules and install objects ahead of time. For example, when users define resources in a model, assigning Node referrers, it is necessary to run
ctl-project -a install
across all the Nodes where that object was assigned. Missing this is a frequent pitfall. Likewise, users must run
ctl -I .* -p project -m Managed-Entity -c Install-Module -- -module module
to have modules downloaded and installed. An automated process should distribute these resources, perhaps on a regular basis, maybe via Jobcenter (and/or assisted by Workbench).
- Alternative CTL nodedispatch transport
- CTL nodedispatching occurs over SSH which, while a ubiquitous Unix service, is not necessarily ideal for all users. For one thing, SSH is not ubiquitous for Windows users and requires an SSH server to be installed (perhaps using Cygwin, a tedious dependency). Further, the SSH connections are not long running, and therefore a setup/teardown is required for each dispatched command. Finally, it is difficult to implement an asynchronous workflow using this 'fire and wait until complete' model.
- Reconcile the "CTL server"
- CTL was designed to be a lightweight command execution framework that could work in a server-less configuration, but practically speaking, centralized management greatly benefits from a server. Currently, Jobcenter performs the role of a central point of control to execute CTL commands, while Workbench is used to populate model and module artifacts to the WebDAV. As mentioned elsewhere here, CTL should have a consistent view of resources it needs to retrieve from central repositories (i.e., pulling files from a plain web server). Some view Jobcenter as the "real CTL" application as it embeds CTL and provides a user interface. In this view, Jobcenter would be installed on all clients and become a long running process there, additionally offering a network-accessible client interface. Taking this approach, Workbench could be refactored so it does not use its "CtlIntegration" code. Instead, it would be left to Jobcenter to take care of those tasks.
- No Workbench dependency
- As an extension to the #RESTful interfaces idea, it would be more scalable for CTL's "Get-Properties" requests to be plain HTTP GETs of files rather than Workbench requests. This would lend itself to caching proxy servers. Note that the Managed-Entity#Get-Properties implementation works this way, but most access is done via Deployment#Get-Properties, which makes Workbench requests.
- Reconcile CTL data model
- Commands execute in a property context prepared by the Get-Properties command. The representation returned is a Java/Ant property file (see Entity.properties), which differs from the project.xml format often used during initial setup. There should be closer correspondence between the two. Many users actually assume that the project.xml they upload to the server will later be returned to them during command execution.
Better support for managing virtualized infrastructure
Many ControlTier users manage environments made up of both physical and virtual servers. Beyond these in-house managed hosts are those from cloud providers. ControlTier is well suited for managing distributed environments since it provides abstractions to application configurations. More can be done to support virtualized environments, though, ranging from automating VM provisioning and modeling Nodes that come in and out of service to integrating with VM management tools (e.g., to query node descriptions).
One possible improvement would be to model the state of a VM Node in the Node type, supporting the notion of a Node being declared in the VM manager but offline/suspended/powered-off.
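A minimal Python sketch of that idea follows. The `NodeState` values mirror the states named above, but the `Node` class, its `is_dispatchable` method and the node names are assumptions for illustration and do not correspond to the actual ControlTier Node type.

```python
from dataclasses import dataclass
from enum import Enum


class NodeState(Enum):
    """States a VM-backed Node might be in, per the paragraph above."""
    ONLINE = "online"
    OFFLINE = "offline"
    SUSPENDED = "suspended"
    POWERED_OFF = "powered-off"


@dataclass
class Node:
    """Hypothetical Node record; it stays in the model whatever its state."""
    name: str
    state: NodeState = NodeState.ONLINE

    def is_dispatchable(self) -> bool:
        # Assumption: commands can only be dispatched to Nodes that are online.
        return self.state is NodeState.ONLINE


def dispatch_targets(nodes):
    """Return the names of Nodes that can currently receive commands."""
    return [node.name for node in nodes if node.is_dispatchable()]


nodes = [
    Node("vm-a"),
    Node("vm-b", NodeState.SUSPENDED),
    Node("vm-c", NodeState.POWERED_OFF),
]
print(dispatch_targets(nodes))
```

The point of the design is that a suspended or powered-off Node remains declared in the model, so it can be skipped during dispatch without being deleted and re-created.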
Workbench (aka itnav) was the first ControlTier project, therefore the oldest, and assuming we don't want to make radical architectural design changes, could use an overhaul.
- RDF is the underlying model representation used within Workbench. Workbench is based on a fairly old version of Jena (2.4) and newer versions have high performance optimizations and backends (TDB/SDB) and use the now standard SPARQL query language. Workbench should be upgraded to the newest stable version of Jena to resolve outstanding bugs and eliminate the use of now defunct RDQL.
- Web framework
- This is part of development modernization, too. Workbench is a Struts-based application and a very old Struts implementation at that (v1.1!). Struts has continued to evolve (combining with WebWork) but perhaps a better approach would be to use the same framework used by the newer webapps, Jobcenter and Reportcenter. Jobcenter and Reportcenter are both Grails-based webapps and we find them easier to enhance and maintain. Consolidating to a common app framework will reduce maintenance burden overall.
Some might call our modules "plug-ins" but in any case it's the way we extend the framework.
Development and release
ControlTier includes all the core and elements modules as part of the main distribution. Some of these modules are well documented, mature and stable, while others are experimental and not yet production quality, yet all are still included. Other open source projects might provide a website and mechanism for users to find and download just what they need. This promotes participation from developers outside of the core committers and lets each extension evolve along its own development path.
ControlTier core types establish a small but effective set of complementary abstractions. For example, Service represents any long running application deployment, while Package represents a deployable packaged-artifact. Users specialize behavior through subtyping. The downside to subtyping is that it requires users to know about these new types, obscuring the core abstraction. Another approach is to separate the implementation from the core abstraction (cf. Java interface) and use configuration to tie the two together. This could be done with a convention around command hooks and called scripts and/or the strategy pattern.
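Here is a small Python sketch of the strategy-pattern variant, assuming a single core Service abstraction whose commands are bound to implementations by configuration instead of by subtyping. The `Service` class, the hook functions and the resource names are hypothetical and only illustrate the shape of the idea.

```python
from typing import Callable, Dict


# Hypothetical implementations; in practice these might be called scripts.
def start_tomcat(name: str) -> str:
    return f"starting tomcat service {name}"


def start_jboss(name: str) -> str:
    return f"starting jboss service {name}"


class Service:
    """One core abstraction; behavior comes from configured hooks, not subtypes."""

    def __init__(self, name: str, hooks: Dict[str, Callable[[str], str]]) -> None:
        self.name = name
        self.hooks = hooks

    def run(self, command: str) -> str:
        hook = self.hooks.get(command)
        if hook is None:
            raise NotImplementedError(f"no hook configured for {command!r}")
        return hook(self.name)


# Configuration ties the abstraction to an implementation.
web = Service("web1", {"Start": start_tomcat})
app = Service("app1", {"Start": start_jboss})
print(web.run("Start"))
print(app.run("Start"))
```

Users only ever see Service and its commands; swapping the implementation is a configuration change and does not introduce a new type to learn.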
Modules can be developed as a set and released via a library or extension. Neither form supports library-to-library dependencies. Today, modules are deployed to CTL via the Managed-Entity#Install-Module command, usually indirectly by ctl-project. This creates an administrative burden and would be improved by a better mechanism. Module-to-module dependencies are supported only for supertype relationships. Besides supertype dependencies, there can be other related modules (perhaps due to child resource allowed-type constraints). The concepts of seed and extension should be reconciled, as both provide a structure for multiple modules. The extension has the advantage of the ctl-extension tool.
Public module distribution
It would be attractive to have a CPAN-like global repository where users can get publicly shared modules. This would avoid having to install all types in a project even when they are not required, and would help users stay on top of new updates.
Standardized module unit test framework
There is little unit testing included in the module code base now. The "coretests" module supports unit tests for the utility types and there are now a couple for Elements modules. The unit testing practice needs to be better baked into the module development cycle. Requiring some level of testing will also encourage better module design, making strategy-based (either by configured scripts or callouts to external modules) implementations commonplace.
Reportcenter offers a basic interface that lists events reported by Jobcenter, CTL commands and Workbench model changes. It offers no organization but rather provides a filtering interface that selects events matching the filter criteria. Users can configure an RSS feed from this filter, but there is no outbound notification capability. Here are several approaches to improving reporting.
- Allow users to define triggers that will forward matching events via email. Subscriptions should support two modes: immediate notification and daily digest reports.
- Events could be organized based on a simple view of the resource model and/or process categorization.
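The trigger idea in the list above can be sketched in a few lines of Python. The `Subscription` class, the event field names and the addresses are assumptions made for illustration; Reportcenter has no such API today, and actual email delivery is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, str]


@dataclass
class Subscription:
    """Hypothetical trigger: a filter plus a delivery mode."""
    email: str
    matches: Callable[[Event], bool]
    mode: str = "immediate"  # or "digest"
    pending: List[Event] = field(default_factory=list)


def route_event(event: Event, subscriptions: List[Subscription]) -> List[str]:
    """Return addresses to notify now; queue the event for digest subscribers."""
    notify_now = []
    for sub in subscriptions:
        if not sub.matches(event):
            continue
        if sub.mode == "immediate":
            notify_now.append(sub.email)
        else:
            sub.pending.append(event)
    return notify_now


def flush_digest(sub: Subscription) -> List[Event]:
    """Hand back the queued events for the daily digest and clear the queue."""
    events, sub.pending = sub.pending, []
    return events


oncall = Subscription("oncall@example.com", lambda e: e["status"] == "failed")
manager = Subscription("manager@example.com", lambda e: True, mode="digest")

failed = {"source": "Jobcenter", "status": "failed"}
ok = {"source": "CTL", "status": "succeeded"}
print(route_event(failed, [oncall, manager]))
print(route_event(ok, [oncall, manager]))
print(len(flush_digest(manager)))
```

A scheduled job would call `flush_digest` once a day per digest subscription and mail the result.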