Ruby: An introduction – Who am I?

Presented by: Maciej Mensfeld
senior ruby [email protected]
lead ruby [email protected]
[email protected]
dev.mensfeld.pl
github.com/mensfeld

Ruby: An introduction – Please…

• …ask me to slow down if I speak too quickly;
• …ask me again if I forget;
• …ask questions if anything I say is not clear;
• …feel free to share your own observations.

Ruby: An introduction – What is Ruby?

Ruby WT*?
[Ruby pictures]

What is Ruby?

• Pure object-oriented programming language (even the number 1 is an instance of a class);
• Created by Yukihiro Matsumoto in 1993;
• Freely available and open-source;
• Syntax is readable and easy to learn;
• Used for text processing, web apps, general system administration, and AI and math research;
• Can be extended with Ruby or low-level C;
• Really helpful community.

Ruby: An introduction – What I love in Ruby?

Clarity not ceremony – Main program

Java:
    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello World");
        }
    }

Ruby:
    puts "Hello World"

Try it out!

Expressive syntax && objects, objects, objects…

    3.times { puts "Ruby is cool" }
    ["Maciek", "John", "Anna"].first #=> "Maciek"
    ["Maciek", "John", "Anna"].last  #=> "Anna"
    attr_accessor :name

    "Anna".class #=> String
    nil.class    #=> NilClass
    1.class      #=> Integer
    {}.class     #=> Hash
    [].class     #=> Array
    self.class   #=> Object
    (0..9).class #=> Range

Ruby: An introduction – syntax

Ruby syntax – hello world as a function

Hello World!
    puts "Hello World!"

Try it out!

Hello YourName!
    puts "Hello #{name}"

    def h
      puts "Hello World!"
    end

    h            # prints "Hello World!"

    def h(name = "World")
      puts "Hello #{name}!"
    end

    h("Maciek")  # prints "Hello Maciek!"

Ruby syntax – classes, methods, objects

Try it out! Hello YourName! as an object

    # Comments start with "#"
    class Messenger
      def initialize(name)
        # Instance variables start with "@"
        @name = name
      end

      public

      def hello
        puts "Hello #{@name}!"
      end
    end

    msg = Messenger.new("Maciek")
    msg.hello    # prints "Hello Maciek!"

Ruby syntax – arrays, hashes (dictionaries)

Arrays
    names = ['Maciek', 'John', 'Freddy']
    names.length #=> 3

Hashes
    debts = {"Maciek" => 1, "John" => 10}
    debts.length #=> 2

Ruby syntax – loops

Ruby:
    friends.each { |friend| puts friend }

C:
    for (i = 0; i < number_of_elements; i++) { print element[i] }

Try it out!
    10.times { |i| puts i }
    10.downto(1) { |i| puts i }

There is no C-style "for" loop in Ruby! (A for ... in form exists, but each is the idiomatic choice.)

Ruby craziness – symbols

When you ask someone "What are symbols in Ruby?", most programmers will say: they simply are!

A symbol in Ruby is an instance of the class Symbol. A symbol is written by prefixing an identifier with a colon:
    :name, :id, :user

Symbols are most commonly used as hash keys:
    h = {:name => "Jayson", :email => "[email protected]"}

OMG symbols are so weird…

The advantage of symbols is efficient use of memory. A reference to a symbol never takes more space than an integer, because internally a symbol is represented by an integer ID. A string's memory use, by contrast, depends on the length of the string.

Also, each time a string literal is evaluated, a new String instance is created. For symbols, the same identifier always points to the same memory location!
    puts "name".object_id
    puts "name".object_id
    puts :name.object_id
    puts :name.object_id

Try it out! Compare:
    puts "name".object_id == "name".object_id
    puts :name.object_id == :name.object_id

Ruby: writing some cool stuff

Web crawler!

Enough theory! Let's be pragmatic!

Simple web crawler requirements
• Fetch and store URLs;
• Search for keywords (support regexp);
• Don't revisit URLs;
• Print results.

Web crawler – page content parser

What do we need?
• Parser: extracts data from page content;
• Crawler: crawls all available pages.

Simple parser – 13LOC

• attr_reader – makes an instance variable read-only from the outside;
• attr_accessor – makes an instance variable readable and writable from the outside.

Simple parser – 13LOC

    @keyword.is_a?(String) ? @keyword.downcase : @keyword

Just like in C and PHP:
    condition ? value_if_true : value_if_false

Try it out!
    true ? puts("I'm true!") : puts("I'm false!")
    false ? puts("I'm true!") : puts("I'm false!")

Simple parser – 13LOC

Try it out!

Crawler – How should it work?

1. Try to download a page.
2. Success?
   No: select another page.
   Yes: mark the page as visited, then do something with the result (parse, send, etc.).

Crawler – 37LOC

Try it out!

• We will check only pages with specified extensions (html, asp, php, etc.);
• Add our start URL to the @new_urls array (basically, create the @new_urls array with one address in it);
• No pages were visited yet, so @visited_urls = [] (empty array).

Crawler – read page in Ruby

Reading pages in Ruby is easy!

Try it out!
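A minimal sketch of reading a page with error handling, assuming Ruby's standard net/http and uri libraries. PageReader, read, and the injectable fetcher are hypothetical names chosen for this illustration and are not taken from the slides. Net::HTTP.get_response does not raise on a 404 or 500 by itself, so the sketch raises explicitly for non-success responses.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not from the slides): reads pages and remembers
# which URLs were already tried.
class PageReader
  # Assumption for this sketch: the default fetcher downloads the page and
  # treats every non-2xx response (404, 500, etc.) as an error.
  DEFAULT_FETCHER = lambda do |url|
    response = Net::HTTP.get_response(URI.parse(url))
    raise "HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

    response.body
  end

  attr_reader :visited_urls, :current_content

  # fetcher: any callable that takes a URL string and returns the page body.
  def initialize(fetcher = DEFAULT_FETCHER)
    @fetcher = fetcher
    @visited_urls = []
    @current_content = nil
  end

  # Returns true when the page was loaded, false on any error.
  def read(url)
    @visited_urls << url # mark as visited first, so failures are not retried
    @current_content = @fetcher.call(url)
    true
  rescue StandardError
    @current_content = nil
    false
  end
end
```

Passing the fetcher in makes the class easy to try without network access, for example PageReader.new(->(url) { "<html></html>" }).read("http://example.test/index.html").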
• Mark the page as visited (add it to the @visited_urls array);
• Load the current page content (parse URL and address);
• Catch any type of exception (404, 500, etc.): mark the page as visited so we don't go there again, and return false.

Crawler – extract URLs

Reading pages in Ruby is easy!

Try it out!

• Use the URI.extract method to extract all URLs from @current_content;
• Check if each URL is valid (try to parse it and check its extension);
• If anything fails, assume that this URL is incorrect.

Crawler – run crawler!

Try it out!

THX

Presented by: Maciej Mensfeld
[email protected]
dev.mensfeld.pl
github.com/mensfeld
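
The parser and crawler described in the slides can be sketched roughly as follows. This is an illustration under stated assumptions, not the presenter's 13LOC and 37LOC code: the class layout, method names, the limit parameter, and the injectable fetcher are all hypothetical. It also swaps URI.extract, which current Ruby marks as obsolete, for a plain regular expression.

```ruby
require 'uri'

# Hypothetical parser: counts keyword occurrences in page content.
class Parser
  attr_reader :keyword

  def initialize(keyword)
    # String keywords are matched case-insensitively; a Regexp is used as given.
    @keyword = keyword.is_a?(String) ? keyword.downcase : keyword
  end

  def count(content)
    text = @keyword.is_a?(String) ? content.downcase : content
    text.scan(@keyword).length
  end
end

# Hypothetical crawler: visits each URL once and records keyword counts.
class Crawler
  EXTENSIONS = %w[html asp php].freeze
  URL_PATTERN = %r{https?://[^\s"'<>]+} # assumption: replaces URI.extract

  attr_reader :visited_urls, :results

  # fetcher: any callable taking a URL string and returning the page body.
  def initialize(start_url, parser, fetcher)
    @new_urls = [start_url]
    @visited_urls = []
    @parser = parser
    @fetcher = fetcher
    @results = {}
  end

  # Crawls until no new URLs remain or `limit` pages were visited.
  def run(limit = 50)
    while (url = @new_urls.shift) && @visited_urls.length < limit
      next if @visited_urls.include?(url)

      @visited_urls << url
      content = fetch(url)
      next unless content

      @results[url] = @parser.count(content)
      @new_urls.concat(extract_urls(content))
    end
    @results
  end

  private

  # Any failure while fetching means "skip this page".
  def fetch(url)
    @fetcher.call(url)
  rescue StandardError
    nil
  end

  def extract_urls(content)
    content.scan(URL_PATTERN).select { |url| valid?(url) }
  end

  # A URL is valid if it parses, has an allowed extension, and is unvisited.
  def valid?(url)
    extension = File.extname(URI.parse(url).path.to_s).delete('.').downcase
    EXTENSIONS.include?(extension) && !@visited_urls.include?(url)
  rescue URI::InvalidURIError
    false
  end
end
```

To print results, something like crawler.run.each { |url, hits| puts "#{url}: #{hits}" } would do, where the fetcher passed to Crawler.new could be a lambda wrapping Net::HTTP.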