Subset Wiki

The c2.com wiki is too big to see in a single federated wiki site. Better to break it up into topics. My first thought was to use the existing category labels, but I have chosen to use frequently found title words instead. commit

The original wiki insisted on page titles made from multiple alphabetic words. This forced authors to name pages carefully, if sometimes awkwardly.

✔ Study word usage in existing titles.

✔ Select subset based on subdomain for sitemap.

✔ Allow each subdomain to access all wiki content.

✔ Provide unique flag for each subdomain.

html

Wiki pages are stored in a flat-file database (.wdb). Our analysis starts by breaking the page names into individual words. We count the unique words and then list the most frequent first.

code
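The word count described above might be sketched as follows. This is a minimal illustration, not the original analysis script: it assumes titles are run-together capitalized words such as ExtremeProgramming, and the names `title_words`, `word_counts`, and the sample titles are hypothetical.

```python
import re
from collections import Counter

def title_words(title):
    """Split a run-together title such as 'ExtremeProgramming' into its words."""
    return re.findall(r"[A-Z][a-z]+", title)

def word_counts(titles):
    """Count how many titles contain each word (a word counts once per title)."""
    counts = Counter()
    for title in titles:
        counts.update(set(title_words(title)))
    return counts

# Hypothetical sample; the real titles would come from the .wdb database.
sample_titles = ["ExtremeProgramming", "PairProgramming", "ExtremeValues", "WikiWikiWeb"]
counts = word_counts(sample_titles)

# List the most frequent words first, breaking ties alphabetically.
for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(n, word)
```

Counting each word once per title means a title like WikiWikiWeb adds one page to the "Wiki" subset rather than two.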

Every word is a potential subset. We'd like words that are both distinguishing and meaningful. We'd also like them to label 100 to 500 pages. A few words identify subsets larger than that.

code
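The size criterion can be sketched as a simple filter over the word counts. The function name `candidate_words` and the sample counts below are hypothetical; the filter only handles size, so judging whether a word is distinguishing and meaningful remains a human decision.

```python
def candidate_words(counts, low=100, high=500):
    """Return words that label between low and high pages, largest first."""
    words = [w for w, n in counts.items() if low <= n <= high]
    return sorted(words, key=lambda w: (-counts[w], w))

# Hypothetical counts for illustration only, not measured from the wiki.
sample_counts = {"Pattern": 480, "Of": 2100, "Extreme": 140, "Zebra": 3}
print(candidate_words(sample_counts))
```

Words above the upper bound are not necessarily excluded in practice, since a few chosen words do identify larger subsets.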

The previous prototype offered only the most recently edited pages in the sitemap.

code

We continue to prefer more recently edited pages in the rare cases where the subset would exceed our expanded sitemap page limit of 500.

code
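One way to express the limit, assuming each page record carries a title and a last-edit date: keep the pages whose titles contain the word, sort newest first, and truncate at 500. The record layout and the name `subset_sitemap` are assumptions made for this sketch.

```python
import re

def title_words(title):
    """Split a run-together title such as 'ExtremeProgramming' into its words."""
    return re.findall(r"[A-Z][a-z]+", title)

def subset_sitemap(pages, word, limit=500):
    """Select pages whose title contains word, preferring recent edits when over limit."""
    matching = [p for p in pages if word in title_words(p["title"])]
    matching.sort(key=lambda p: p["date"], reverse=True)  # newest first
    return matching[:limit]

# Hypothetical page records; the dates are arbitrary integers for illustration.
pages = [
    {"title": "ExtremeProgramming", "date": 30},
    {"title": "PairProgramming", "date": 10},
    {"title": "ProgrammingLanguage", "date": 20},
    {"title": "WikiWikiWeb", "date": 40},
]
recent = subset_sitemap(pages, "Programming", limit=2)
print([p["title"] for p in recent])
```

With the limit at 500, truncation only happens for the few words whose subsets are larger than that.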

html

We revised the original wiki cgi script to emit json. This script also builds sitemaps based on search words passed in as subdomains. github
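As a rough sketch of the subdomain idea, and not the revised cgi script itself, the search word could be taken from the leftmost label of the request's host name and matched against title words. The host name `pattern.wiki.example`, the function names, and the two-field entry layout are all assumptions made for this example.

```python
import json
import re

def search_word_from_host(host):
    """Return the leftmost host label, e.g. 'pattern' for 'pattern.wiki.example'."""
    return host.split(":")[0].split(".")[0].lower()

def sitemap_json(titles_with_dates, host):
    """Emit a JSON list of pages whose title contains the subdomain's word."""
    word = search_word_from_host(host)
    entries = [
        {"title": title, "date": date}
        for title, date in titles_with_dates
        if word in (w.lower() for w in re.findall(r"[A-Z][a-z]+", title))
    ]
    return json.dumps(entries)

# Hypothetical titles and dates for illustration.
sample = [("PatternLanguage", 1), ("ExtremeProgramming", 2)]
print(sitemap_json(sample, "pattern.wiki.example"))
```

Matching is case-insensitive here because host names are lowercase while title words are capitalized.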

We created six subset categories with three to seven subdomains referenced in each one. Mostly these aligned with interests in the 1995-2000 time period.

reference

In use, one would choose one category and then start browsing from the aggregated recent changes that load. All categories together represent 7,000 pages.