Greedy Solutions to Complex Problems?
Reflecting, in the build-up to this year’s ARVO, on the approaches used for OCT layer segmentation, two things are immediately apparent. First, it’s a harder problem than people initially thought: marketing managers who, ten years ago, expected fully automated, multi-layer analysis to be ubiquitous within a year or two are still looking for solutions. Second, the public-domain methods in use have consolidated dramatically into one of two approaches, each with its own strengths and weaknesses; a review published just five years ago [Debuc 2010] would look very different if written today. The two methods are:
- Those involving graph-traversal, and
- Those involving graph-cuts.
This posting takes a quick look at these seemingly similar approaches, summarizes the pros and cons of each, discusses who is using what, and touches on some of the recent literature.
Perhaps because of their simplicity, graph-traversal methods now dominate: both commercial organizations and academia have been using these approaches for the past ten years. Within the literature, they are often billed as Dijkstra’s algorithm, dynamic programming, or the Bellman-Ford algorithm. Here are the pros and cons.
- Fast with linear (O(n)) complexity, depending on the implementation.
- Simple to implement, and very customizable in terms of cost images that are traversed.
- Seemingly highly appropriate for layer segmentation in 2D images.
- Inherently only a 2D image solution.
- Require constraints: the paths are only optimal given known start and end points, a scenario that is never guaranteed.
- Require tuning for different pathologies – i.e., to encode smoothness constraints – and derive solutions from a greedy evaluation of the path found.
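To make the traversal idea concrete, here is a minimal sketch of the dynamic-programming variant: one boundary traced across a 2D cost image in which low values mark the layer of interest. The function name, the `max_step` smoothness parameter, and the free start/end columns are illustrative assumptions, not taken from any of the published implementations mentioned above.

```python
import numpy as np

def segment_boundary(cost, max_step=1):
    """Trace one layer boundary across a 2D cost image by dynamic programming.

    cost     : (rows, cols) array; low values mark the boundary to follow.
    max_step : smoothness constraint -- largest vertical jump allowed
               between adjacent columns.
    Returns one row index per column.
    """
    rows, cols = cost.shape
    acc = np.full((rows, cols), np.inf)   # accumulated path cost
    back = np.zeros((rows, cols), dtype=int)
    acc[:, 0] = cost[:, 0]                # free start anywhere in column 0
    for c in range(1, cols):
        for r in range(rows):
            lo, hi = max(0, r - max_step), min(rows, r + max_step + 1)
            prev = acc[lo:hi, c - 1]
            k = int(np.argmin(prev))
            acc[r, c] = cost[r, c] + prev[k]
            back[r, c] = lo + k
    # backtrack from the cheapest end point in the last column
    path = np.empty(cols, dtype=int)
    path[-1] = int(np.argmin(acc[:, -1]))
    for c in range(cols - 1, 0, -1):
        path[c - 1] = back[path[c], c]
    return path
```

Note how the algorithm’s character shows through: the `max_step` window is exactly the kind of per-pathology tuning mentioned above, and the whole thing is inherently a single-B-scan, 2D solution.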
Who is Using this Stuff?
The list is pretty large, but in academia we have at least the Debuc lab at Bascom Palmer, the VIP lab at Duke, the COOL lab at OHSU, and the pattern recognition lab in Erlangen-Nürnberg; and in industry we have RSIP Vision (commercially licensed, per the web site), Topcon, and Carl Zeiss Meditec (algorithm described in this paper). [Farsiu 2014] is the relevant patent in this area.
Turning to graph cuts, the undisputed leaders in this area are the group at Iowa led by Professor Sonka, who were the first to apply the method to OCT images, publishing impressive results as early as 2006. Building on the min-cut/max-flow algorithm of [Boykov 2004], they added a special, layer-based parameterization that encoded a smoothness constraint into the optimization [Li 2006]. Many variations of the method have appeared since, and other groups have been happy to follow their lead. Here are the pros and cons.
- Inherently N-D, so ideal for volumetric processing.
- The solutions are optimal in terms of the cost terms.
- Highly tunable based on cost images and also the graph architecture.
- Very slow, with high computational complexity and large demands on memory, so not currently clinically feasible.
- They require accurate initialization, which has mostly been addressed with learning algorithms, meaning the training data dictates the resulting solutions.
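For a flavor of the graph construction, here is a simplified 2D, single-surface sketch in the spirit of the [Li 2006] column-graph formulation: node weights are vertical cost differences (so a column’s cut cost telescopes to the cost at the surface row), infinite-capacity intra-column arcs make each column’s cut monotone, and inter-column arcs bound how far the surface can jump between columns. The names, the `max_shift` parameter, and the use of `networkx` are illustrative assumptions; real implementations use dedicated max-flow solvers.

```python
import networkx as nx
import numpy as np

INF = 10**9  # stands in for infinite capacity

def min_cut_surface(cost, max_shift=1):
    """Single-surface segmentation of a 2D cost image via min-cut/max-flow
    (a simplified sketch of the Li et al. style column-graph construction)."""
    rows, cols = cost.shape
    w = np.empty((rows, cols), dtype=float)
    w[0, :] = cost[0, :] - 1e6            # force every column into the cut
    w[1:, :] = cost[1:, :] - cost[:-1, :] # weights telescope to cost(surface)

    G = nx.DiGraph()
    for c in range(cols):
        for r in range(rows):
            v = (r, c)
            if w[r, c] < 0:               # negative weight -> source arc
                G.add_edge('s', v, capacity=-w[r, c])
            else:                         # non-negative weight -> sink arc
                G.add_edge(v, 't', capacity=w[r, c])
            if r > 0:                     # monotone cut within a column
                G.add_edge(v, (r - 1, c), capacity=INF)
            for dc in (-1, 1):            # smoothness between columns
                if 0 <= c + dc < cols:
                    G.add_edge(v, (max(0, r - max_shift), c + dc),
                               capacity=INF)

    _, (src_side, _) = nx.minimum_cut(G, 's', 't')
    # the surface sits at the highest source-side node in each column
    return np.array([max(r for r in range(rows) if (r, c) in src_side)
                     for c in range(cols)])
```

Even in this toy form, the trade-offs listed above are visible: the construction extends naturally to N-D and yields a globally optimal surface for the given costs, but the graph has one node per voxel, which is where the memory and run-time demands come from.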
Who is Using this Stuff?
Professor Sonka’s lab at Iowa, Jerry Prince’s lab at JHU, the ARTORG center in Bern, and a lab at the Nanjing University of Science and Technology. In industry the methods are less widely adopted, although a new algorithm from Heidelberg Engineering may well be graph-cut based [Heidelberg 2014]. The relevant patent in this area is [Li 2006].
This is clearly an exciting and important research area. Clinical researchers tend to be experts at making the most of technology, ready for prime time or not, so despite the shortcomings of any one method, utility will be found in each, given their various strengths. The obvious caveat is that the software still has to be usable: it must fit, not define, the workflow, and it must operate reliably.
Recent literature [Tian 2016] compared some of the methods [Chiu 2011], [Dufour 2013], [Garvin 2008], [Lang 2013], [Tian 2015] and also that from Heidelberg Engineering [Heidelberg 2014], and is recommended reading to understand how the field has evolved using these methods.
Partly because we are not constrained by academic funding or corporate demands, Voxeleron’s approach to retinal layer segmentation is fully orthogonal to those discussed here. It has to be: running full 3D segmentations in 3-4 seconds per volume makes clear that our patented algorithms are not graph-based. Instead, we have the freedom to keep evolving our own innovative technologies, giving researchers useful tools that open up new areas of investigation. And to continue doing so, we must fully understand the alternative technologies that are out there.