First commit

Juhani Krekelä 1 year ago
commit
b9e5ffa2b3
8 changed files with 346 additions and 0 deletions
  1. 4 0
      .gitignore
  2. 116 0
      CC0
  3. 45 0
      src/entry.py
  4. 19 0
      src/export_known_hosts.py
  5. 25 0
      src/hashing.py
  6. 76 0
      src/process_known_hosts.py
  7. 39 0
      src/write_file.py
  8. 22 0
      sshwot-format.text

+ 4 - 0
.gitignore

@@ -0,0 +1,4 @@
+__pycache__
+*.pyc
+*.swp
+*.sshwot

+ 116 - 0
CC0

@@ -0,0 +1,116 @@
+CC0 1.0 Universal
+
+Statement of Purpose
+
+The laws of most jurisdictions throughout the world automatically confer
+exclusive Copyright and Related Rights (defined below) upon the creator and
+subsequent owner(s) (each and all, an "owner") of an original work of
+authorship and/or a database (each, a "Work").
+
+Certain owners wish to permanently relinquish those rights to a Work for the
+purpose of contributing to a commons of creative, cultural and scientific
+works ("Commons") that the public can reliably and without fear of later
+claims of infringement build upon, modify, incorporate in other works, reuse
+and redistribute as freely as possible in any form whatsoever and for any
+purposes, including without limitation commercial purposes. These owners may
+contribute to the Commons to promote the ideal of a free culture and the
+further production of creative, cultural and scientific works, or to gain
+reputation or greater distribution for their Work in part through the use and
+efforts of others.
+
+For these and/or other purposes and motivations, and without any expectation
+of additional consideration or compensation, the person associating CC0 with a
+Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
+and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
+and publicly distribute the Work under its terms, with knowledge of his or her
+Copyright and Related Rights in the Work and the meaning and intended legal
+effect of CC0 on those rights.
+
+1. Copyright and Related Rights. A Work made available under CC0 may be
+protected by copyright and related or neighboring rights ("Copyright and
+Related Rights"). Copyright and Related Rights include, but are not limited
+to, the following:
+
+  i. the right to reproduce, adapt, distribute, perform, display, communicate,
+  and translate a Work;
+
+  ii. moral rights retained by the original author(s) and/or performer(s);
+
+  iii. publicity and privacy rights pertaining to a person's image or likeness
+  depicted in a Work;
+
+  iv. rights protecting against unfair competition in regards to a Work,
+  subject to the limitations in paragraph 4(a), below;
+
+  v. rights protecting the extraction, dissemination, use and reuse of data in
+  a Work;
+
+  vi. database rights (such as those arising under Directive 96/9/EC of the
+  European Parliament and of the Council of 11 March 1996 on the legal
+  protection of databases, and under any national implementation thereof,
+  including any amended or successor version of such directive); and
+
+  vii. other similar, equivalent or corresponding rights throughout the world
+  based on applicable law or treaty, and any national implementations thereof.
+
+2. Waiver. To the greatest extent permitted by, but not in contravention of,
+applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
+unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
+and Related Rights and associated claims and causes of action, whether now
+known or unknown (including existing as well as future claims and causes of
+action), in the Work (i) in all territories worldwide, (ii) for the maximum
+duration provided by applicable law or treaty (including future time
+extensions), (iii) in any current or future medium and for any number of
+copies, and (iv) for any purpose whatsoever, including without limitation
+commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
+the Waiver for the benefit of each member of the public at large and to the
+detriment of Affirmer's heirs and successors, fully intending that such Waiver
+shall not be subject to revocation, rescission, cancellation, termination, or
+any other legal or equitable action to disrupt the quiet enjoyment of the Work
+by the public as contemplated by Affirmer's express Statement of Purpose.
+
+3. Public License Fallback. Should any part of the Waiver for any reason be
+judged legally invalid or ineffective under applicable law, then the Waiver
+shall be preserved to the maximum extent permitted taking into account
+Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
+is so judged Affirmer hereby grants to each affected person a royalty-free,
+non transferable, non sublicensable, non exclusive, irrevocable and
+unconditional license to exercise Affirmer's Copyright and Related Rights in
+the Work (i) in all territories worldwide, (ii) for the maximum duration
+provided by applicable law or treaty (including future time extensions), (iii)
+in any current or future medium and for any number of copies, and (iv) for any
+purpose whatsoever, including without limitation commercial, advertising or
+promotional purposes (the "License"). The License shall be deemed effective as
+of the date CC0 was applied by Affirmer to the Work. Should any part of the
+License for any reason be judged legally invalid or ineffective under
+applicable law, such partial invalidity or ineffectiveness shall not
+invalidate the remainder of the License, and in such case Affirmer hereby
+affirms that he or she will not (i) exercise any of his or her remaining
+Copyright and Related Rights in the Work or (ii) assert any associated claims
+and causes of action with respect to the Work, in either case contrary to
+Affirmer's express Statement of Purpose.
+
+4. Limitations and Disclaimers.
+
+  a. No trademark or patent rights held by Affirmer are waived, abandoned,
+  surrendered, licensed or otherwise affected by this document.
+
+  b. Affirmer offers the Work as-is and makes no representations or warranties
+  of any kind concerning the Work, express, implied, statutory or otherwise,
+  including without limitation warranties of title, merchantability, fitness
+  for a particular purpose, non infringement, or the absence of latent or
+  other defects, accuracy, or the present or absence of errors, whether or not
+  discoverable, all to the greatest extent permissible under applicable law.
+
+  c. Affirmer disclaims responsibility for clearing rights of other persons
+  that may apply to the Work or any use thereof, including without limitation
+  any person's Copyright and Related Rights in the Work. Further, Affirmer
+  disclaims responsibility for obtaining any necessary consents, permissions
+  or other rights required for any use of the Work.
+
+  d. Affirmer understands and acknowledges that Creative Commons is not a
+  party to this document and has no duty or obligation with respect to this
+  CC0 or use of the Work.
+
+For more information, please see
+<http://creativecommons.org/publicdomain/zero/1.0/>

+ 45 - 0
src/entry.py

@@ -0,0 +1,45 @@
+from collections import namedtuple
+
+import hashing
+
+# Entry(bytes[32], bytes[32], bytes[32], bytes[0…2¹⁶-1])
+Entry = namedtuple('Entry', ['salt', 'hashed_host', 'fingerprint', 'comment'])
+
+class UnacceptableComment(Exception): pass
+
+def create_entry(domain, port, fingerprint, comment):
+	"""create_entry(str, u16, bytes[32], str) → Entry
+	Given an unprocessed host, a binary fingerprint and a comment, creates
+	an entry describing it"""
+	assert type(domain) == str
+	assert type(port) == int and 0 <= port <= (1<<16) - 1
+	assert type(fingerprint) == bytes and len(fingerprint) == 32
+	assert type(comment) == str
+
+	# We want to have domain names reasonably normalized. This is why we
+	# convert all internationalized domain names to punycode and
+	# lowercase all domains.
+	# The reason the lowercasing happens after the punycoding is because
+	# that way we don't have to worry about Unicode case mapping: in
+	# case of IDN the IDNA codec handles that for us, and in case of an
+	# ASCII domain it passes through the IDNA unmodified
+	processed_host = domain.encode('idna').lower()
+
+	# If the port is not :22, we store [host]:port instead
+	if port != 22:
+		processed_host = b'[%s]:%i' % (processed_host, port)
+
+	# Hash the host and store the salt
+	salt, hashed_host = hashing.hash_host(processed_host)
+
+	# Comment must not include newlines
+	if '\n' in comment:
+		raise UnacceptableComment('Comment contains newlines')
+
+	comment_encoded = comment.encode('utf-8')
+
+	# Comment may be at most 2¹⁶-1 bytes long
+	if len(comment_encoded) >= 1<<16:
+		raise UnacceptableComment('Comment length of %i bytes is too long' % len(comment_encoded))
+
+	return Entry(salt, hashed_host, fingerprint, comment_encoded)

+ 19 - 0
src/export_known_hosts.py

@@ -0,0 +1,19 @@
+import sys
+
+import process_known_hosts
+import write_file
+
+def main():
+	entries = []
+	# TODO: Don't hardcode
+	# TODO: Handle errors
+	with open(sys.argv[1], 'r') as f:
+		for line in f:
+			entries.extend(process_known_hosts.process_line(line))
+
+	with open('known_hosts.sshwot', 'wb') as f:
+		write_file.write(f, entries)
+
+if __name__ == '__main__':
+	main()
+

+ 25 - 0
src/hashing.py

@@ -0,0 +1,25 @@
+import hashlib
+import os
+
+def hash_with_salt(host, salt):
+	"""hash_with_salt(bytes, bytes) → bytes[32]
+	Hash the host using sha256 and the given salt"""
+	assert type(host) == bytes
+	assert type(salt) == bytes
+	m = hashlib.sha256()
+	m.update(host)
+	m.update(salt)
+	return m.digest()
+
+def generate_salt():
+	"""generate_salt() → bytes[32]
+	Generates 32 bytes of randomness using the system urandom"""
+	return os.urandom(32)
+
+def hash_host(host):
+	"""hash_host(bytes) → (bytes[32]: salt, bytes[32]: hashed_host)
+	Generates a salt and hashes the host with it"""
+	assert type(host) == bytes
+	salt = generate_salt()
+	hashed_host = hash_with_salt(host, salt)
+	return salt, hashed_host
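
A note on how the stored salt is meant to be used: checking whether a host is present comes down to re-hashing the candidate host with each entry's salt and comparing the result. The sketch below illustrates that under the assumption that the hash is sha256(host concat salt), as in `hash_with_salt`; `host_matches` is a hypothetical helper name and is not part of this commit.

```python
import hashlib
import os

def hash_with_salt(host, salt):
    # Same construction as src/hashing.py: sha256 over host followed by salt
    return hashlib.sha256(host + salt).digest()

def host_matches(host, salt, hashed_host):
    """Hypothetical helper: True if host hashes to hashed_host under salt."""
    return hash_with_salt(host, salt) == hashed_host

# Simulate what create_entry stores for a host on the default port
salt = os.urandom(32)
stored = hash_with_salt(b'example.com', salt)

print(host_matches(b'example.com', salt, stored))
print(host_matches(b'example.org', salt, stored))
```

Because the salt is random per entry, a lookup has to try every entry in the file; one SHA-256 call per entry keeps that inexpensive.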

+ 76 - 0
src/process_known_hosts.py

@@ -0,0 +1,76 @@
+import base64
+import hashlib
+
+import entry
+
+class KnownHostsSyntaxError(Exception): pass
+
+class HashedHostError(Exception): pass
+
+def process_line(line):
+	"""process_line(str) → [Entry]
+	Given a string containing one line of .ssh/known_hosts file, create
+	a list of Entries based on it."""
+	assert type(line) == str
+
+	# Remove the trailing newline, if any (also safe for an empty string)
+	if line.endswith('\n'): line = line[:-1]
+
+	# Just skip over empty lines
+	if line == '': return []
+
+	# Each line has host(s), algorithm, public key, and possibly one
+	# more optional field
+	fields = line.split(' ')
+	if len(fields) != 3 and len(fields) != 4:
+		raise KnownHostsSyntaxError('Weird number of fields on a line (%i)' % len(fields))
+
+	hosts, algorithm, public_key = fields[0:3]
+
+	# Generate public key fingerprint
+	# The key is stored base64 encoded, so decode it first
+	try:
+		public_key_binary = base64.b64decode(public_key, validate = True)
+	except ValueError as err:  # binascii.Error is a subclass of ValueError
+		raise KnownHostsSyntaxError('Malformed public key: %s' % public_key) from err
+	
+	# Fingerprint is sha256 hash of the public key
+	m = hashlib.sha256()
+	m.update(public_key_binary)
+	fingerprint = m.digest()
+
+	# There can be several hosts separated with a comma
+	entries = []
+	for host in hosts.split(','):
+		# A host can't be empty
+		if len(host) == 0:
+			raise KnownHostsSyntaxError('An empty host')
+
+		# If the host begins with '|' it's hashed
+		# We cannot deal with those
+		if host[0] == '|':
+			raise HashedHostError('Cannot deal with hashed hosts')
+
+		# If the host begins with '[' it's a nonstandard port
+		# The format will be [domain]:port
+		# Extract both
+		# Otherwise, default to port 22
+		if host[0] == '[':
+			host_and_port = host[1:].split(']:')
+			if len(host_and_port) != 2:
+				raise KnownHostsSyntaxError('Unrecognized host format: ' + host)
+
+			domain = host_and_port[0]
+			try:
+				port = int(host_and_port[1])
+			except ValueError:
+				raise KnownHostsSyntaxError('Malformed port: %s' % host_and_port[1])
+
+		else:
+			domain = host
+			port = 22
+
+		# Default to no comment
+		entries.append(entry.create_entry(domain, port, fingerprint, ''))
+
+	return entries

+ 39 - 0
src/write_file.py

@@ -0,0 +1,39 @@
+def write_header(f):
+	"""write_header(file(wb))
+	Writes the header to the given file."""
+	# b'WOT' magic
+	f.write(b'WOT')
+	# Version number: a single zero byte (bytes(0) would be empty)
+	f.write(bytes([0]))
+
+def write_entry(f, salt, hashed_host, fingerprint, comment):
+	"""write_entry(file(wb), bytes[32], bytes[32], bytes[32], bytes[0…2¹⁶-1])
+	Writes an entry to the given file."""
+	assert type(salt) == bytes and len(salt) == 32
+	assert type(hashed_host) == bytes and len(hashed_host) == 32
+	assert type(fingerprint) == bytes and len(fingerprint) == 32
+	assert type(comment) == bytes and 0 <= len(comment) <= (1<<16) - 1
+
+	# u8[32]: salt
+	f.write(salt)
+
+	# u8[32]: hashed_host
+	f.write(hashed_host)
+
+	# u8[32]: fingerprint
+	f.write(fingerprint)
+
+	# u16le: len(comment)
+	comment_len = len(comment)
+	f.write(bytes([comment_len & 0xff, comment_len >> 8]))
+
+	# u8[]: comment
+	f.write(comment)
+
+def write(f, entries):
+	"""write(file(wb), [Entry])
+	Creates a file containing all of the entries"""
+	write_header(f)
+	
+	for entry in entries:
+		write_entry(f, entry.salt, entry.hashed_host, entry.fingerprint, entry.comment)

+ 22 - 0
sshwot-format.text

@@ -0,0 +1,22 @@
+The file has a header like
+	u8[3]: magic = b'WOT'
+	u8: version = 0
+
+After the header the entries are laid out as
+	u8[32]: salt
+	u8[32]: sha256(host concat salt)
+	u8[32]: sha256-fingerprint
+	u16le: length of comment in bytes
+	utf8[]: comment
+
+If port is not 22, the host is [host]:port. This is in accordance with how
+OpenSSH stores it in .ssh/known_hosts. Internationalized domain names are
+punycoded and all domain names are converted into lower case. This differs
+from OpenSSH, which is not IDN-aware.
+
+SHA-256 is used instead of a password hash since we want checking whether
+a host is present to be reasonably fast.
+
+The comment field can contain any valid Unicode, but must not contain
+newline characters. An implementation should check for them when displaying
+the comment.
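
To make the layout above concrete, here is a minimal reader sketch. It assumes the header is exactly `b'WOT'` followed by a single zero version byte and that entries follow back to back until end of file. `read_entries` is a hypothetical name; this commit only contains a writer.

```python
import io
import struct

def read_entries(f):
    """Hypothetical reader: yields (salt, hashed_host, fingerprint, comment)."""
    # Header: 3 bytes of magic plus 1 version byte
    if f.read(4) != b'WOT\x00':
        raise ValueError('Not a version 0 sshwot file')
    while True:
        # Fixed-size part of an entry: 32 + 32 + 32 + 2 bytes
        fixed = f.read(98)
        if len(fixed) == 0:
            return
        if len(fixed) != 98:
            raise ValueError('Truncated entry')
        salt, hashed_host, fingerprint = fixed[0:32], fixed[32:64], fixed[64:96]
        # u16le comment length, then that many bytes of UTF-8
        (comment_len,) = struct.unpack('<H', fixed[96:98])
        comment = f.read(comment_len)
        if len(comment) != comment_len:
            raise ValueError('Truncated comment')
        yield salt, hashed_host, fingerprint, comment.decode('utf-8')

# Build a one-entry file in memory and read it back
sample = (b'WOT\x00' + bytes(32) + bytes([1]) * 32 + bytes([2]) * 32
          + struct.pack('<H', 2) + b'hi')
entries = list(read_entries(io.BytesIO(sample)))
print(entries[0][3])
```

A real reader would also reject comments containing newlines, as the format text requires; that check is left out here to keep the sketch short.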