David Andrzejewski, Univ. of Wisconsin-Madison (USA)
David G. Stork, Ricoh Innovations, Inc. and Stanford Univ. (USA)
Xiaojin Zhu, Univ. of Wisconsin-Madison (USA)
Ron Spronk, Queen's Univ. (Canada)

Visual arts
• Digital authentication of Bruegel, Perugino (Lyu et al., 2004)
• Jackson Pollock (Taylor, 1999) (Irfan and Stork, 2009)
Writings
• Authorship of the Federalist Papers (Mosteller and Wallace, 1964)
• Ronald Reagan’s radio addresses (Airoldi et al., 2007)

http://www.artchive.com
Haags Gemeentemuseum, The Hague

Better understand compositional style
1. Develop a formal representation of the paintings
2. Extract these representations from paintings
3. Train a generative model
4. Learn relative visual weights of colors
5. Classify true Mondrians versus
   1. “fakes” created by the generative model in step 3
   2. “earlier states” of the Transatlantic paintings

• Vertical/horizontal lines
  • locations
  • extents
• Rectangles
  • locations
  • sizes
  • colors
  • can span multiple lines

Hypothesize an underlying probabilistic model that generates observed data
Many uses in machine learning
• Make predictions (Naïve Bayes)
• Generate new examples (Markov model)
• Interpret parameter values (Linear regression)
Given data, learn/train model parameters
• Our approach: Maximum likelihood estimation (MLE)

Canvas aspect ratios (kernel density estimator)

Number of horiz/vert lines (Poisson)
Horiz/vert line spacing (Dirichlet)

Segments are deleted / invisible / left alone (Polya)

Rectangle colors (Multinomial)

Don’t allow unrealistic “hanging” lines
Require ≥ 1 vertical line

Rectangle color    Multinomial probability
White              0.754
Red                0.085
Yellow             0.062
Blue               0.065
Black              0.034

Line type          Spacing Dirichlet
Vertical           1.80
Horizontal         1.61

Calculate visual “center of mass”
Assume true Mondrians centered at [0.5, 0.5]
Learn color weights via linear programming

Red      Yellow   Blue     Black
0.237    0.143    0.227    0.392

Completed in Europe, but then altered after Mondrian’s arrival in the United States
A variety of techniques (x-ray, UV, etc.) were used to recover the earlier states (Cooper & Spronk, 2001)

Composition with Red, Blue, and Yellow (1937-1942)

Composition with Red, Yellow, and Blue (1935-1942)

No. 9 (1939-1942)

Very popular technique in machine learning
At each iteration, choose a rule to “split” on
Resulting partitions should be more “pure” with respect to target classification (true Mondrian or computer-generated fake?)
Key feature: resulting trees easy to interpret
Estimate accuracy with leave-one-out cross-validation
Control over-fitting with pruning

45 true Mondrians versus 45 generated “fakes”

Classifier                      Accuracy
Majority baseline               50%
Decision tree (no pruning)      70%
Decision tree (with pruning)    68%

45 true Mondrians versus 11 “earlier states”

Classifier                      Accuracy
Majority baseline               81%
Decision tree (no pruning)      72%
Decision tree (with pruning)    75%

Analysis of results: Transatlantic dataset
  < 1% pixels blue
  # horiz / # vert < 0.9
  Low visual “density”
  THEN Transatlantic

Formal representation and feature extraction
Generative model
• Fitting simple statistics of Mondrians cannot create realistic synthetic paintings
• Color weights align well with our intuitions
Classification
• Can reliably discriminate true Mondrians vs. computer-generated
• Cannot do so for true Mondrians vs. Transatlantic “earlier states”
  ▪ Underlying images were “nearly complete” (!)
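
The generative-model slides (Poisson line counts, Dirichlet line spacing, multinomial rectangle colors) can be illustrated with a minimal sampling sketch. It rests on several assumptions that are not in the slides: the Poisson rates in `LINE_RATE` are hypothetical placeholders, the reported Dirichlet values are read as symmetric concentration parameters, every line spans the full canvas (the Polya segment step and the "hanging line" check are omitted), and all function names are illustrative. It is not the authors' implementation.

```python
import math
import random

# Values reported on the slides.
COLOR_PROBS = {"white": 0.754, "red": 0.085, "yellow": 0.062,
               "blue": 0.065, "black": 0.034}
SPACING_ALPHA = {"vertical": 1.80, "horizontal": 1.61}

# Hypothetical Poisson rates: the slides do not report them.
LINE_RATE = {"vertical": 3.0, "horizontal": 3.0}


def sample_poisson(rate, rng):
    """Knuth's method: count uniform draws whose running product stays above e^-rate."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_line_positions(orientation, rng, min_lines=0):
    """Draw a line count, then place the lines using Dirichlet-distributed gaps."""
    n = sample_poisson(LINE_RATE[orientation], rng)
    while n < min_lines:  # reject draws that violate the minimum-line constraint
        n = sample_poisson(LINE_RATE[orientation], rng)
    if n == 0:
        return []
    # A symmetric Dirichlet sample is a set of normalized Gamma draws;
    # n lines split the unit interval into n + 1 gaps.
    alpha = SPACING_ALPHA[orientation]
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n + 1)]
    total = sum(gammas)
    positions, running = [], 0.0
    for g in gammas[:-1]:
        running += g / total
        positions.append(running)
    return positions


def sample_composition(seed=None):
    """Sample line positions on a unit canvas and one color per grid cell."""
    rng = random.Random(seed)
    xs = sample_line_positions("vertical", rng, min_lines=1)  # require >= 1 vertical line
    ys = sample_line_positions("horizontal", rng)
    n_cells = (len(xs) + 1) * (len(ys) + 1)
    colors = rng.choices(list(COLOR_PROBS),
                         weights=list(COLOR_PROBS.values()), k=n_cells)
    return {"vertical": xs, "horizontal": ys, "colors": colors}


if __name__ == "__main__":
    print(sample_composition(seed=0))
```

Passing a seed makes a sampled composition reproducible, which is convenient when building a fixed set of "fakes" for a classifier to compare against.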
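
The visual "center of mass" idea can be sketched as a weighted centroid over colored rectangles, using the color weights reported on the slides. Several details here are assumptions rather than the authors' method: a rectangle's mass is taken to be its color weight times its area, white is given zero weight because the slides list none, and the rectangle format `(x0, y0, x1, y1, color)` and the function names are hypothetical. The linear program that learned the weights is not reproduced.

```python
import math

# Color weights reported on the slides (learned there by linear programming).
COLOR_WEIGHTS = {"red": 0.237, "yellow": 0.143, "blue": 0.227, "black": 0.392}
WHITE_WEIGHT = 0.0  # assumption: the slides list no weight for white


def center_of_mass(rectangles):
    """Return the weighted centroid (x, y) of colored rectangles.

    Each rectangle is (x0, y0, x1, y1, color) in unit-canvas coordinates.
    Colors without a listed weight fall back to WHITE_WEIGHT.
    Returns None when every rectangle has zero visual mass.
    """
    total = sum_x = sum_y = 0.0
    for x0, y0, x1, y1, color in rectangles:
        # Assumed mass model: color weight times rectangle area.
        mass = COLOR_WEIGHTS.get(color, WHITE_WEIGHT) * (x1 - x0) * (y1 - y0)
        total += mass
        sum_x += mass * (x0 + x1) / 2
        sum_y += mass * (y0 + y1) / 2
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


def offset_from_center(rectangles, target=(0.5, 0.5)):
    """Distance between the visual center of mass and the assumed target center."""
    center = center_of_mass(rectangles)
    if center is None:
        return 0.0  # nothing but zero-weight color: treat as balanced
    return math.hypot(center[0] - target[0], center[1] - target[1])


if __name__ == "__main__":
    layout = [
        (0.0, 0.0, 0.4, 0.4, "red"),
        (0.7, 0.7, 1.0, 1.0, "blue"),
        (0.4, 0.0, 1.0, 0.7, "white"),
    ]
    print(center_of_mass(layout), offset_from_center(layout))
```

Under the slides' assumption that true Mondrians are centered at [0.5, 0.5], a small offset would indicate a balanced composition under these weights.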