@robinpath/robots
Version 0.1.1 | Node.js | Public
Robots
robots.txt parsing, generation, URL permission checking, and crawl configuration
Package: @robinpath/robots | Category: Web | Type: Utility
Authentication
No authentication required. All functions are available immediately.
Use Cases
Use the robots module when you need to:
- Parse robots.txt content -- use robots.parse
- Generate robots.txt -- use robots.create
- Check if a URL is allowed -- use robots.isAllowed
- Check if a URL is disallowed -- use robots.isDisallowed
- Get the crawl delay for an agent -- use robots.getCrawlDelay
Quick Reference
| Function | Description | Returns |
|---|---|---|
parse | Parse robots.txt content | {rules[], sitemaps[], host} |
create | Generate robots.txt | robots.txt content |
isAllowed | Check if URL is allowed | true if allowed |
isDisallowed | Check if URL is disallowed | true if disallowed |
getCrawlDelay | Get crawl delay for agent | Delay in seconds or null |
getSitemaps | Get sitemap URLs | Sitemap URLs |
getRules | Get rules for agent | Matching rules |
addRule | Add rule to robots config | Updated config |
removeRule | Remove rule by user agent | Updated config |
addSitemap | Add sitemap URL | Updated config |
allowAll | Generate allow-all robots.txt | robots.txt |
disallowAll | Generate disallow-all robots.txt | robots.txt |
fetch | Fetch and parse robots.txt from URL | {found, status, parsed, raw} |
Functions
parse
Parse robots.txt content
Module: robots | Returns: object -- {rules[], sitemaps[], host}
robots.parse $text
| Parameter | Type | Required | Description |
|---|---|---|---|
text | string | Yes | robots.txt content |
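A quick sketch of parsing and then reading the result back through the accessors documented below; the $text variable is assumed to already hold raw robots.txt content, and the Googlebot agent is only an example:
@desc "Parse robots.txt and list rules for one agent"
do
set $parsed as robots.parse $text
set $rules as robots.getRules $parsed "Googlebot"
each $rule in $rules
print $rule
end
enddo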
create
Generate robots.txt
Module: robots | Returns: string -- robots.txt content
robots.create {"rules": [{"userAgent": "*", "disallow": ["/admin"]}]}
| Parameter | Type | Required | Description |
|---|---|---|---|
config | object | Yes | {rules[], sitemaps[], host} |
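A config can also carry allow rules, a crawl delay, sitemaps, and a host, per the config and rule shapes described on this page; the values below are illustrative only:
set $config as {"rules": [{"userAgent": "*", "allow": ["/public"], "disallow": ["/admin"], "crawlDelay": 10}], "sitemaps": ["https://example.com/sitemap.xml"], "host": "example.com"}
set $txt as robots.create $config
print $txt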
isAllowed
Check if URL is allowed
Module: robots | Returns: boolean -- true if allowed
robots.isAllowed $robots "/page" "Googlebot"
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots or raw text |
url | string | Yes | URL or path |
userAgent | string | No | Bot name (default *) |
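A minimal sketch of gating a crawl on the parsed rules; the path and bot name are placeholders:
@desc "Only crawl a path the rules permit"
do
set $ok as robots.isAllowed $robots "/products" "MyBot"
if $ok == true
print "Allowed to crawl /products"
else
print "Skipping /products"
end
enddo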
isDisallowed
Check if URL is disallowed
Module: robots | Returns: boolean -- true if disallowed
robots.isDisallowed $robots "/admin"
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots or raw text |
url | string | Yes | URL or path |
userAgent | string | No | Bot name |
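When no bot name is passed, the check presumably falls back to the default agent, as with isAllowed; a short sketch with /admin as an example path:
set $blocked as robots.isDisallowed $robots "/admin"
if $blocked == true
print "/admin is blocked for the default agent"
else
print "/admin is reachable"
end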
getCrawlDelay
Get crawl delay for agent
Module: robots | Returns: number -- Delay in seconds or null
robots.getCrawlDelay $robots "Googlebot"
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots |
userAgent | string | No | Bot name |
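Because the result is null when no Crawl-delay directive applies, check it before use; the bot name is an example:
@desc "Read the crawl delay, falling back to a default"
do
set $delay as robots.getCrawlDelay $robots "MyBot"
if $delay == null
print "No crawl delay set, using a default"
else
print "Crawl delay in seconds: " + $delay
end
enddo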
getSitemaps
Get sitemap URLs
Module: robots | Returns: array -- Sitemap URLs
robots.getSitemaps $robots
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots |
getRules
Get rules for agent
Module: robots | Returns: array -- Matching rules
robots.getRules $robots "Googlebot"
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots |
userAgent | string | No | Bot name (all if omitted) |
addRule
Add rule to robots config
Module: robots | Returns: object -- Updated config
robots.addRule $robots {"userAgent": "BadBot", "disallow": ["/"]}
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots |
rule | object | Yes | {userAgent, allow[], disallow[], crawlDelay} |
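Since addRule returns an updated config, it can be fed straight back into create, as in this sketch (BadBot is a placeholder name):
@desc "Block a bot and regenerate robots.txt"
do
set $updated as robots.addRule $robots {"userAgent": "BadBot", "disallow": ["/"]}
set $txt as robots.create $updated
print $txt
enddo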
removeRule
Remove rule by user agent
Module: robots | Returns: object -- Updated config
robots.removeRule $robots "BadBot"
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots |
userAgent | string | Yes | Bot name |
addSitemap
Add sitemap URL
Module: robots | Returns: object -- Updated config
robots.addSitemap $robots "https://example.com/sitemap.xml"
| Parameter | Type | Required | Description |
|---|---|---|---|
robots | object | Yes | Parsed robots |
url | string | Yes | Sitemap URL |
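As with addRule, the updated config can be inspected or regenerated; here the new entry is read back with getSitemaps, and the URL is an example:
@desc "Add a sitemap and list all sitemap URLs"
do
set $updated as robots.addSitemap $robots "https://example.com/sitemap.xml"
set $sitemaps as robots.getSitemaps $updated
each $url in $sitemaps
print $url
end
enddo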
allowAll
Generate allow-all robots.txt
Module: robots | Returns: string -- robots.txt
robots.allowAll
| Parameter | Type | Required | Description |
|---|---|---|---|
userAgent | string | No | Bot name (default *) |
disallowAll
Generate disallow-all robots.txt
Module: robots | Returns: string -- robots.txt
robots.disallowAll
| Parameter | Type | Required | Description |
|---|---|---|---|
userAgent | string | No | Bot name (default *) |
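Both helpers take only the optional bot name; a short sketch where MyBot is a placeholder:
set $open as robots.allowAll
set $closed as robots.disallowAll "MyBot"
print $open
print $closed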
fetch
Fetch and parse robots.txt from URL
Module: robots | Returns: object -- {found, status, parsed, raw}
robots.fetch "https://example.com"
| Parameter | Type | Required | Description |
|---|---|---|---|
url | string | Yes | Site URL |
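A minimal sketch of fetching and validating the result; since this page's examples do not show field access, the sketch only checks that something came back rather than reading found or status individually:
@desc "Fetch a site's robots.txt and report the outcome"
do
set $result as robots.fetch "https://example.com"
if $result != null
print "Fetched: " + $result
else
print "No robots.txt retrieved"
end
enddo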
Error Handling
All functions throw on failure. Common errors:
| Error | Cause |
|---|---|
| (standard errors) | Check function parameters and input values |
Validate the result before using it when a call may return nothing:
@desc "Parse and validate result"
do
set $result as robots.parse $text
if $result != null
print "Success"
else
print "No result"
end
enddo
Recipes
1. List and iterate sitemaps
Retrieve all sitemap URLs and loop through them.
@desc "Get sitemap URLs and iterate results"
do
set $result as robots.getSitemaps $robots
each $item in $result
print $item
end
enddo
2. Create a new item with create
Create a new resource and capture the result.
set $result as robots.create {"rules": [{"userAgent": "*", "disallow": ["/admin"]}]}
print "Created: " + $result
3. Check before creating
Check whether a crawl delay is already configured and only generate a new robots.txt if it is not.
@desc "Check crawl delay before creating"
do
set $existing as robots.getCrawlDelay $robots "Googlebot"
if $existing == null
robots.create {"rules": [{"userAgent": "*", "disallow": ["/admin"]}]}
print "Item created"
else
print "Item already exists"
end
enddo
4. Multi-step Robots workflow
Chain multiple robots operations together.
@desc "Parse, create, and more"
do
set $r_parse as robots.parse $text
set $r_create as robots.create {"rules": [{"userAgent": "*", "disallow": ["/admin"]}]}
set $r_isAllowed as robots.isAllowed $robots "/page" "Googlebot"
print "All operations complete"
enddo
5. Safe parse with validation
Check results before proceeding.
@desc "Parse and validate result"
do
set $result as robots.parse $text
if $result != null
print "Success: " + $result
else
print "Operation returned no data"
end
enddo
Related Modules
- json -- JSON module for complementary functionality
- ftp (@robinpath, v0.1.3) -- FTP and SFTP file transfer operations
- http (@robinpath, v0.1.3) -- HTTP server for RobinPath scripts: register routes with static responses (JSON, HTML, files), enable CORS, serve static directories, no callbacks needed
- form (@robinpath, v0.2.1) -- Declarative form builder for RobinPath scripts: define fields inline, generate schemas, validate submissions
- api (@robinpath, v0.1.2) -- HTTP client for making requests to external APIs with profiles, auth, download/upload, and auto-JSON parsing
Versions (1)
| Version | Tag | Published |
|---|---|---|
| 0.1.1 | latest | 1 month ago |
Installation
$ robinpath add @robinpath/robots
