Module: Spidr::Events

Defined in:
lib/spidr/events.rb

Instance Method Details

- (Object) all_headers(&block) {|headers| ... }

Pass the headers from every response the agent receives to a given block.

Yields:

  • (headers) — The block will be passed the headers of every response.

Yield Parameters:

  • (Hash) headers — The headers from a response.


# File 'lib/spidr/events.rb', line 69

def all_headers(&block)
  every_page { |page| block.call(page.headers) }
end
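
The forwarding in all_headers can be tried without a network connection. The following is a minimal sketch, assuming a hypothetical HeadersDemoAgent class and HeadersDemoPage struct that stand in for the real agent and page objects. The visit method is an assumed dispatch step that is not part of this listing, and the header values are made up for illustration.

```ruby
# HeadersDemoPage and HeadersDemoAgent are hypothetical stand-ins, not part of Spidr.
HeadersDemoPage = Struct.new(:url, :headers)

class HeadersDemoAgent
  def initialize
    @every_page_blocks = []
  end

  # Mirrors the every_page listing: store the block and return self.
  def every_page(&block)
    @every_page_blocks << block
    return self
  end

  # Mirrors the all_headers listing: forward each page's headers.
  def all_headers(&block)
    every_page { |page| block.call(page.headers) }
  end

  # Assumed dispatch step: hand a fetched page to every stored block.
  def visit(page)
    @every_page_blocks.each { |blk| blk.call(page) }
  end
end

content_types = []
agent = HeadersDemoAgent.new
agent.all_headers { |headers| content_types << headers['content-type'] }

agent.visit(HeadersDemoPage.new('http://example.com/', { 'content-type' => 'text/html' }))
agent.visit(HeadersDemoPage.new('http://example.com/a.css', { 'content-type' => 'text/css' }))
```

After both visits, content_types holds one entry per page, in the order the pages were dispatched.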

- (Object) every_atom_doc(&block) {|doc| ... }

Pass every Atom document that the agent parses to a given block.

Yields:

  • (doc) — The block will be passed every Atom document parsed.

Yield Parameters:

  • (Nokogiri::XML::Document) doc — A parsed XML document.

# File 'lib/spidr/events.rb', line 388

def every_atom_doc(&block)
  every_page do |page|
    if (block && page.atom?)
      if (doc = page.doc)
        block.call(doc)
      end
    end
  end
end

- (Object) every_atom_page(&block) {|feed| ... }

Pass every Atom feed that the agent visits to a given block.

Yields:

  • (feed) — The block will be passed every Atom feed visited.

Yield Parameters:

  • (Page) feed — A visited page.


# File 'lib/spidr/events.rb', line 452

def every_atom_page(&block)
  every_page do |page|
    block.call(page) if (block && page.atom?)
  end
end

- (Object) every_bad_request_page(&block) {|page| ... }

Pass every Bad Request page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Bad Request page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 141

def every_bad_request_page(&block)
  every_page do |page|
    block.call(page) if (block && page.bad_request?)
  end
end

- (Object) every_css_page(&block) {|page| ... }

Pass every CSS page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every CSS page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 422

def every_css_page(&block)
  every_page do |page|
    block.call(page) if (block && page.css?)
  end
end

- (Object) every_doc(&block) {|doc| ... }

Pass every HTML or XML document that the agent parses to a given block.

Yields:

  • (doc) — The block will be passed every HTML or XML document parsed.

Yield Parameters:

  • (Nokogiri::HTML::Document, Nokogiri::XML::Document) doc — A parsed HTML or XML document.

# File 'lib/spidr/events.rb', line 282

def every_doc(&block)
  every_page do |page|
    if block
      if (doc = page.doc)
        block.call(doc)
      end
    end
  end
end
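
The two guards in every_doc (a block was given, and page.doc is not nil) mean that pages without a parsed document never reach the block. The following is a rough sketch of that behavior, assuming a hypothetical DocDemoAgent class and DocDemoPage struct; a plain String stands in for a parsed Nokogiri document, and the visit method is an assumed dispatch step that this document does not show.

```ruby
# DocDemoPage and DocDemoAgent are hypothetical stand-ins, not part of Spidr.
DocDemoPage = Struct.new(:url, :doc)

class DocDemoAgent
  def initialize
    @every_page_blocks = []
  end

  # Mirrors the every_page listing.
  def every_page(&block)
    @every_page_blocks << block
    return self
  end

  # Same guards as the every_doc listing: a block must be given and
  # page.doc must be non-nil before the block is called.
  def every_doc(&block)
    every_page do |page|
      if block
        if (doc = page.doc)
          block.call(doc)
        end
      end
    end
  end

  # Assumed dispatch step, not shown in this document.
  def visit(page)
    @every_page_blocks.each { |blk| blk.call(page) }
  end
end

parsed = []
agent = DocDemoAgent.new
agent.every_doc { |doc| parsed << doc }

# A String stands in for a parsed document here.
agent.visit(DocDemoPage.new('http://example.com/', '<html></html>'))
# No document is available for this page, so the block is skipped.
agent.visit(DocDemoPage.new('http://example.com/logo.png', nil))
```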

- (Object) every_failed_url(&block) {|url| ... }

Pass each URL that could not be requested to the given block.

Yields:

  • (url) — The block will be passed every URL that could not be requested.

Yield Parameters:

  • (URI::HTTP) url — A failed URL.


# File 'lib/spidr/events.rb', line 36

def every_failed_url(&block)
  @every_failed_url_blocks << block
  return self
end

- (Object) every_forbidden_page(&block) {|page| ... }

Pass every Forbidden page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Forbidden page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 171

def every_forbidden_page(&block)
  every_page do |page|
    block.call(page) if (block && page.forbidden?)
  end
end

- (Object) every_html_doc(&block) {|doc| ... }

Pass every HTML document that the agent parses to a given block.

Yields:

  • (doc) — The block will be passed every HTML document parsed.

Yield Parameters:

  • (Nokogiri::HTML::Document) doc — A parsed HTML document.

# File 'lib/spidr/events.rb', line 303

def every_html_doc(&block)
  every_page do |page|
    if (block && page.html?)
      if (doc = page.doc)
        block.call(doc)
      end
    end
  end
end

- (Object) every_html_page(&block) {|page| ... }

Pass every HTML page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every HTML page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 232

def every_html_page(&block)
  every_page do |page|
    block.call(page) if (block && page.html?)
  end
end

- (Object) every_internal_server_error_page(&block) {|page| ... }

Pass every Internal Server Error page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Internal Server Error page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 202

def every_internal_server_error_page(&block)
  every_page do |page|
    block.call(page) if (block && page.had_internal_server_error?)
  end
end

- (Object) every_javascript_page(&block) {|page| ... }

Pass every JavaScript page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every JavaScript page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 407

def every_javascript_page(&block)
  every_page do |page|
    block.call(page) if (block && page.javascript?)
  end
end

- (Object) every_missing_page(&block) {|page| ... }

Pass every Missing page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Missing page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 186

def every_missing_page(&block)
  every_page do |page|
    block.call(page) if (block && page.missing?)
  end
end
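
The status-based callbacks all follow the same shape: wrap the block in every_page and call it only when a predicate on the page is true. The following is a minimal sketch of that filter using every_missing_page, for example to collect broken links. StatusDemoPage and StatusDemoAgent are hypothetical stand-ins, the visit method is an assumed dispatch step, and treating "missing" as HTTP 404 is an assumption made for this sketch only.

```ruby
# StatusDemoPage and StatusDemoAgent are hypothetical stand-ins, not part of Spidr.
StatusDemoPage = Struct.new(:url, :code) do
  # Assumption for this sketch: "missing" means an HTTP 404 response.
  def missing?
    code == 404
  end
end

class StatusDemoAgent
  def initialize
    @every_page_blocks = []
  end

  # Mirrors the every_page listing.
  def every_page(&block)
    @every_page_blocks << block
    return self
  end

  # Mirrors the every_missing_page listing: filter pages by a predicate.
  def every_missing_page(&block)
    every_page do |page|
      block.call(page) if (block && page.missing?)
    end
  end

  # Assumed dispatch step, not shown in this document.
  def visit(page)
    @every_page_blocks.each { |blk| blk.call(page) }
  end
end

broken = []
agent = StatusDemoAgent.new
agent.every_missing_page { |page| broken << page.url }

agent.visit(StatusDemoPage.new('http://example.com/', 200))
agent.visit(StatusDemoPage.new('http://example.com/gone', 404))
```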

- (Object) every_ms_word_page(&block) {|page| ... }

Pass every MS Word page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every MS Word page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 467

def every_ms_word_page(&block)
  every_page do |page|
    block.call(page) if (block && page.ms_word?)
  end
end

- (Object) every_ok_page(&block) {|page| ... }

Pass every OK page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every OK page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 96

def every_ok_page(&block)
  every_page do |page|
    block.call(page) if (block && page.ok?)
  end
end

- (Object) every_page(&block) {|page| ... }

Pass every page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 82

def every_page(&block)
  @every_page_blocks << block
  return self
end
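
Because every_page appends the block to a list and returns self, several callbacks can be registered in one chained expression. The following is a small sketch of that registration pattern, assuming a hypothetical ChainDemoAgent class; the dispatch_page method is an assumed dispatch step and is not part of the listing above.

```ruby
# ChainDemoAgent is a hypothetical stand-in; only every_page mirrors the listing.
class ChainDemoAgent
  def initialize
    @every_page_blocks = []
  end

  # Same body as the listing: store the block, return self for chaining.
  def every_page(&block)
    @every_page_blocks << block
    return self
  end

  # Assumed dispatch step: call each stored block in registration order.
  def dispatch_page(page)
    @every_page_blocks.each { |blk| blk.call(page) if blk }
  end
end

seen = []
agent = ChainDemoAgent.new

# Chaining works because every_page returns the receiver.
agent.every_page { |page| seen << "first:#{page}" }
     .every_page { |page| seen << "second:#{page}" }

agent.dispatch_page('index.html')
```

In this sketch the blocks run in the order they were registered, since they are stored in an Array and iterated with each.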

- (Object) every_pdf_page(&block) {|page| ... }

Pass every PDF page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every PDF page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 482

def every_pdf_page(&block)
  every_page do |page|
    block.call(page) if (block && page.pdf?)
  end
end

- (Object) every_redirect_page(&block) {|page| ... }

Pass every Redirect page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Redirect page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 111

def every_redirect_page(&block)
  every_page do |page|
    block.call(page) if (block && page.redirect?)
  end
end

- (Object) every_rss_doc(&block) {|doc| ... }

Pass every RSS document that the agent parses to a given block.

Yields:

  • (doc) — The block will be passed every RSS document parsed.

Yield Parameters:

  • (Nokogiri::XML::Document) doc — A parsed XML document.

# File 'lib/spidr/events.rb', line 367

def every_rss_doc(&block)
  every_page do |page|
    if (block && page.rss?)
      if (doc = page.doc)
        block.call(doc)
      end
    end
  end
end

- (Object) every_rss_page(&block) {|feed| ... }

Pass every RSS feed that the agent visits to a given block.

Yields:

  • (feed) — The block will be passed every RSS feed visited.

Yield Parameters:

  • (Page) feed — A visited page.


# File 'lib/spidr/events.rb', line 437

def every_rss_page(&block)
  every_page do |page|
    block.call(page) if (block && page.rss?)
  end
end

- (Object) every_timedout_page(&block) {|page| ... }

Pass every Timeout page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Timeout page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 126

def every_timedout_page(&block)
  every_page do |page|
    block.call(page) if (block && page.timedout?)
  end
end

- (Object) every_txt_page(&block) {|page| ... }

Pass every Plain Text page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Plain Text page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 217

def every_txt_page(&block)
  every_page do |page|
    block.call(page) if (block && page.txt?)
  end
end

- (Object) every_unauthorized_page(&block) {|page| ... }

Pass every Unauthorized page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every Unauthorized page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 156

def every_unauthorized_page(&block)
  every_page do |page|
    block.call(page) if (block && page.unauthorized?)
  end
end

- (Object) every_url(&block) {|url| ... }

Pass each URL from each page visited to the given block.

Yields:

  • (url) — The block will be passed every URL from every page visited.

Yield Parameters:

  • (URI::HTTP) url — Each URL from each page visited.


# File 'lib/spidr/events.rb', line 22

def every_url(&block)
  @every_url_blocks << block
  return self
end

- (Object) every_xml_doc(&block) {|doc| ... }

Pass every XML document that the agent parses to a given block.

Yields:

  • (doc) — The block will be passed every XML document parsed.

Yield Parameters:

  • (Nokogiri::XML::Document) doc — A parsed XML document.

# File 'lib/spidr/events.rb', line 324

def every_xml_doc(&block)
  every_page do |page|
    if (block && page.xml?)
      if (doc = page.doc)
        block.call(doc)
      end
    end
  end
end

- (Object) every_xml_page(&block) {|page| ... }

Pass every XML page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every XML page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 247

def every_xml_page(&block)
  every_page do |page|
    block.call(page) if (block && page.xml?)
  end
end

- (Object) every_xsl_doc(&block) {|doc| ... }

Pass every XML Stylesheet (XSL) that the agent parses to a given block.

Yields:

  • (doc) — The block will be passed every XML Stylesheet (XSL) parsed.

Yield Parameters:

  • (Nokogiri::XML::Document) doc — A parsed XML document.

# File 'lib/spidr/events.rb', line 346

def every_xsl_doc(&block)
  every_page do |page|
    if (block && page.xsl?)
      if (doc = page.doc)
        block.call(doc)
      end
    end
  end
end

- (Object) every_xsl_page(&block) {|page| ... }

Pass every XML Stylesheet (XSL) page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every XML Stylesheet (XSL) page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 263

def every_xsl_page(&block)
  every_page do |page|
    block.call(page) if (block && page.xsl?)
  end
end

- (Object) every_zip_page(&block) {|page| ... }

Pass every ZIP page that the agent visits to a given block.

Yields:

  • (page) — The block will be passed every ZIP page visited.

Yield Parameters:

  • (Page) page — A visited page.


# File 'lib/spidr/events.rb', line 497

def every_zip_page(&block)
  every_page do |page|
    block.call(page) if (block && page.zip?)
  end
end

- (Events) initialize(options = {})

A new instance of Events



# File 'lib/spidr/events.rb', line 3

def initialize(options={})
  super(options)

  @every_url_blocks = []
  @every_failed_url_blocks = []
  @urls_like_blocks = Hash.new { |hash,key| hash[key] = [] }

  @every_page_blocks = []
end
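
The @urls_like_blocks hash is built with a default block, so looking up a pattern that has not been seen yet creates and stores an empty Array for it. The following short sketch shows that standard Ruby behavior on its own; the variable names and the symbols used as placeholder handlers are illustrative only.

```ruby
# Same construction as @urls_like_blocks in the listing above.
blocks_by_pattern = Hash.new { |hash, key| hash[key] = [] }

pdf_pattern = /\.pdf\z/

# Reading a missing key creates and stores an empty Array for it,
# so << can be used without first checking whether the key exists.
blocks_by_pattern[pdf_pattern] << :first_handler
blocks_by_pattern[pdf_pattern] << :second_handler
blocks_by_pattern['/about'] << :third_handler
```

Each pattern ends up with its own list of handlers, which is what lets urls_like be called more than once with the same pattern.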

- (Object) urls_like(pattern, &block) {|url| ... }

Pass every URL that the agent visits and that matches a given pattern to a given block.

Parameters:

  • (Regexp, String) pattern — The pattern to match URLs with.

Yields:

  • (url) — The block will be passed every URL that matches the given pattern.

Yield Parameters:

  • (URI::HTTP) url — A matching URL.


# File 'lib/spidr/events.rb', line 54

def urls_like(pattern,&block)
  @urls_like_blocks[pattern] << block
  return self
end