Commit Graph

120 Commits

Author SHA1 Message Date
Jonas 'Sortie' Termansen 3c43f71084 Implement file descriptor passing.
This change refactors the Unix socket / pipe backend to have a ring buffer
containing segments, where each segment has an optional leading ancillary
buffer containing control messages followed by a normal data buffer.

The SCM_RIGHTS control message has been implemented which transfers file
descriptors to the receiving process. File descriptors are reference counted
and cycles are prevented using the following restrictions:

1) Unix sockets cannot be sent on themselves (on either end).
2) Unix sockets themselves being sent cannot be sent on.
3) Unix sockets cannot send a Unix socket being sent on.

This is a compatible ABI change.
2021-12-31 22:24:11 +01:00
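
For illustration, SCM_RIGHTS as seen from user space can be sketched with the
standard sendmsg(2)/recvmsg(2) interface. This is a minimal sketch, not code
from this commit: the helper names send_fd, recv_fd, and fd_passing_demo are
hypothetical, and it assumes a system with working Unix sockets and pipes.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Buffer large enough for one SCM_RIGHTS message carrying a single int,
   aligned suitably for struct cmsghdr. */
union fd_control
{
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one file descriptor over a connected Unix socket (hypothetical helper).
   Returns 0 on success and -1 on failure. */
int send_fd(int sock, int fd)
{
	char byte = 'F'; /* one byte of normal data accompanies the control data */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_control control;
	memset(&control, 0, sizeof(control));
	struct msghdr msg;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);
	struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor (hypothetical helper).
   Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_control control;
	memset(&control, 0, sizeof(control));
	struct msghdr msg;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);
	if ( recvmsg(sock, &msg, 0) != 1 )
		return -1;
	struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
	if ( !cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	     cmsg->cmsg_type != SCM_RIGHTS ||
	     cmsg->cmsg_len < CMSG_LEN(sizeof(int)) )
		return -1;
	int fd;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Round trip: pass the write end of a pipe across a socket pair, then write
   through the received descriptor and read the byte back from the pipe. */
int fd_passing_demo(void)
{
	int sv[2], pfd[2];
	if ( socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 )
		return -1;
	if ( pipe(pfd) < 0 )
		return -1;
	if ( send_fd(sv[0], pfd[1]) < 0 )
		return -1;
	int received = recv_fd(sv[1]);
	if ( received < 0 )
		return -1;
	char c = 0;
	int ok = write(received, "x", 1) == 1 &&
	         read(pfd[0], &c, 1) == 1 && c == 'x';
	close(received);
	close(pfd[0]);
	close(pfd[1]);
	close(sv[0]);
	close(sv[1]);
	return ok ? 0 : -1;
}
```

The received descriptor refers to the same file description as the one sent,
which is why the byte written through it shows up on the read end of the pipe.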
Jonas 'Sortie' Termansen b9898086c6 Add file descriptor table reservations.
The file descriptor table now allows reserving room for multiple file
descriptors without assigning their numbers. This functionality means
any error conditions happen up front and the subsequent number
assignment will never fail.

This change uses the new functionality to fix troublesome error handling
when allocating multiple file descriptors. One pty allocation error path
was even wrong.

There were subtle race conditions where one (kernel) thread may have
allocated one file descriptor, another thread spuriously replaced it
with something else, and then the second file descriptor allocation
failed in the first thread, which closed the first file descriptor now
pointing to a different file description. This case seems harmless but
it's not a great class of bugs to exist in the first place. With the new
behavior, the file descriptions appear in the file descriptor table
without fail, never need to be cleaned up midway, and are immune to
shenanigans from other threads.

Reviewed-by: Pedro Falcato <pedro.falcato@gmail.com>
2021-12-31 22:24:07 +01:00
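
The reserve-then-assign idea can be sketched with a toy table. This is only
an illustration under stated assumptions: struct fd_table, TABLE_SIZE, and the
function names are hypothetical and are not the kernel's actual interface, and
locking is omitted.

```c
#include <errno.h>
#include <stddef.h>

#define TABLE_SIZE 8 /* hypothetical fixed capacity for this sketch */

struct fd_table
{
	void* entries[TABLE_SIZE]; /* NULL means the slot is free */
	size_t used;               /* slots that hold a file description */
	size_t reserved;           /* slots promised but not yet numbered */
};

/* Reserve room for count descriptors. This is the only step that can fail,
   so every error condition is handled before anything becomes visible. */
int fd_table_reserve(struct fd_table* table, size_t count)
{
	if ( TABLE_SIZE - table->used - table->reserved < count )
		return errno = EMFILE, -1;
	table->reserved += count;
	return 0;
}

/* Give back reservations that turned out not to be needed. */
void fd_table_unreserve(struct fd_table* table, size_t count)
{
	table->reserved -= count;
}

/* Turn one reservation into a descriptor number. Because used + reserved
   never exceeds TABLE_SIZE, a free slot always exists for a caller that
   holds a reservation. */
int fd_table_allocate_reserved(struct fd_table* table, void* description)
{
	for ( int fd = 0; fd < TABLE_SIZE; fd++ )
	{
		if ( table->entries[fd] )
			continue;
		table->entries[fd] = description;
		table->used++;
		table->reserved--;
		return fd;
	}
	return -1; /* not reached when the caller holds a reservation */
}
```

A caller that needs two descriptors reserves two slots first, creates both
objects, and only then assigns the numbers, so there is no partially
completed state to undo.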
Jonas 'Sortie' Termansen 5e7605fad2 Implement threading primitives that truly sleep.
The idle thread is now actually run when the system is idle because it
truly goes idle. The idle thread is made power efficient by using the hlt
instruction rather than a busy loop.

The new futex(2) system call is used to implement fast user-space mutexes,
condition variables, and semaphores. The same backend and design is used as
kutexes for truly sleeping kernel mutexes and condition variables.

The new exit_thread(2) flag EXIT_THREAD_FUTEX_WAKE wakes a futex.

Sleeping on clocks in the kernel now uses timers for true sleep.

The interrupt worker thread now truly sleeps when idle.

Kernel threads are now named.

This is a compatible ABI change.
2021-06-23 22:10:47 +02:00
Jonas 'Sortie' Termansen 4daedc31f7 Fix handling of overflow and non-canonical values in timespec APIs.
Support zero relative and absolute times in the timer API.
2021-06-22 21:48:27 +02:00
Jonas 'Sortie' Termansen 3b036b6c5d Add getdnsconfig(2) and setdnsconfig(2). 2021-06-13 23:27:52 +02:00
Jonas 'Sortie' Termansen 20c1f1d0d4 Add signal mask support to ppoll(2). 2018-12-08 22:54:28 +01:00
Jonas 'Sortie' Termansen 62bd9bf901 Fix pid 1 deadlocking when exiting with children.
The child processes of pid 1 were being reparented to pid 1, causing an
infinite loop. This change fixes the problem by adding a hook that runs in
the last thread about to exit in a process. When pid 1 exits, the hook will
prevent more processes and threads from being created, and then broadcast
kill all processes and threads. The hook is not run in LastPrayer(), as that
function runs in a worker thread and it can't block waiting for another
thread to run LastPrayer() in the same thread.
2018-08-06 23:59:35 +02:00
Jonas 'Sortie' Termansen 568c97c77f Fix SEEK_END, file offset overflow, and read/write/mkpartition syscall bugs.
Fix SEEK_END seeking twice as far as requested. Centralize lseek handling in
one place and avoid overflow bugs. Inode lseek handlers now only need to
handle SEEK_END with offset 0. Prevent the file offset from ever going below
zero or overflowing.

Character devices are now not seekable, but lseek will pretend they are, yet
always stay at the file offset 0. pread/pwrite on character devices will now
ignore the file offset and call read/write.

This change prevents character devices from being memory mapped, notably
/dev/zero can no longer be memory mapped. None of the current ports seem
to rely on this behavior and will work with just MAP_ANONYMOUS.

Refactor read and write system calls to have a shared return statement for
both seekable and non-seekable IO.

Fix file offset overflow bugs in read and write system calls.

Fix system calls returning EPERM instead of properly returning EBADF when
the file has not been opened in the right mode.

Truncate IO counts and total vector IO length so the IO operation does not
do any IO beyond OFF_MAX. Also truncate the total vector IO length for
recvmsg and sendmsg. Fail with EINVAL if the total vector IO length exceeds
SSIZE_MAX.

Don't stop early if the total IO length is zero, so zero-length IO now blocks
on any locks internal to the inode.

Handle reads at the maximum file offset with an end of file condition and
handle writes of at least one byte at the maximum file offset by failing
with EFBIG.

Refactor UtilMemoryBuffer to store the file size using off_t instead of
size_t to avoid casts and keep file sizes in the off_t type. Properly
handle errors in the code, such as failing with EROFS instead of EBADF if
the backing memory is not writeable, and failing with EFBIG if writing
beyond the end of the file.

Fix mkpartition not rejecting invalid partition start offsets and lengths.
Strictly enforce partition start and length checks in the partition code.
Enforce that partitions exist within regular files or block devices.

Fix a few indentation issues.
2017-12-04 23:56:46 +01:00
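
A centralized, overflow-safe seek computation along these lines can be
sketched as follows. This is an assumption-laden illustration rather than the
kernel's code: compute_seek, sk_off_t, and SK_OFF_MAX are hypothetical names,
the file size is passed in by the caller, and the errno values follow the
usual POSIX conventions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h> /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-ins so the sketch does not depend on the platform's
   off_t width. */
typedef int64_t sk_off_t;
#define SK_OFF_MAX INT64_MAX

/* Compute the new file offset in one place. current and size must be
   non-negative. Returns the new offset, or -1 with errno set. */
sk_off_t compute_seek(sk_off_t current, sk_off_t size, sk_off_t offset,
                      int whence)
{
	sk_off_t base;
	switch ( whence )
	{
	case SEEK_SET: base = 0; break;
	case SEEK_CUR: base = current; break;
	case SEEK_END: base = size; break; /* size is added once, not twice */
	default: return errno = EINVAL, -1;
	}
	/* base >= 0, so base + offset can only overflow upwards. */
	if ( offset > SK_OFF_MAX - base )
		return errno = EOVERFLOW, -1;
	/* Refuse results below zero; -base cannot overflow because base >= 0. */
	if ( offset < -base )
		return errno = EINVAL, -1;
	return base + offset;
}
```

Both checks are done before the addition, so the sum itself is never
evaluated when it would overflow or go negative.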
Jonas 'Sortie' Termansen 9f1965f36e Prioritize the interrupt worker thread. 2017-05-18 22:40:46 +02:00
Jonas 'Sortie' Termansen acc32ccb49 Make interrupt worker thread reliable. 2017-04-12 23:22:09 +02:00
Jonas 'Sortie' Termansen ef2e478607 Implement getpeername(2) and getsockname(2). 2017-02-26 22:24:35 +01:00
Jonas 'Sortie' Termansen 4eb9caaa39 Fix non-blocking accept4(2) and getting the Unix socket peer address.
Rename the internal kernel method from accept to accept4.
2017-02-26 22:24:18 +01:00
Meisaka Yukara 961ba9ec6c Add cache-aware memory mapping functions.
This commit is joint work by Meisaka Yukara <Meisaka.Yukara@gmail.com> and
Jonas 'Sortie' Termansen <sortie@maxsi.org>.
2017-02-19 12:13:32 +01:00
Meisaka Yukara 307223a5a7 Add PCI scanning functions and busmastering functions.
This commit is joint work by Meisaka Yukara <Meisaka.Yukara@gmail.com> and
Jonas 'Sortie' Termansen <sortie@maxsi.org>.
2017-02-19 12:10:59 +01:00
Jonas 'Sortie' Termansen fcefd86432 Implement shutdown(2). 2017-02-18 15:29:40 +01:00
Jonas 'Sortie' Termansen 4b2cf28bbf Add socket(2).
This removes the /dev/net socket interface.

This is an incompatible ABI change.
2017-02-14 20:43:31 +01:00
Jonas 'Sortie' Termansen a53dd5d29d Support deallocating kernel timers in timer handlers. 2017-02-14 20:43:30 +01:00
Jonas 'Sortie' Termansen 7a8a71674e Move readv/writev family and sendmsg/recvmsg into drivers. 2017-02-13 22:04:21 +01:00
Jonas 'Sortie' Termansen 0bb608b09e Support 8-bit/24-bit color and more escape codes in the graphical console.
The console has gained these escape codes:
 - Set color to any of 256 entries in the palette.
 - Set color to any 24-bit RGB value.
 - Inverse mode.
 - Bold mode.
 - Underline mode.
 - Move cursor to line N.
 - \a is now ignored.

The effectively unused ATTR_CHAR has been removed. Parsing of escape codes
has been improved. The graphical palette has been changed to the tango
colors, which makes Sortix look a bit different. Some user-space programs
have been changed to use different colors that look better under the new
palette.

Remove const from methods that weren't really const and remove mutable
keyword workaround.
2016-11-27 11:19:03 +01:00
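
As an illustration of the 256-color and 24-bit color codes, the sketch below
formats foreground color sequences. It assumes the console follows the common
xterm-style SGR forms (ESC [ 38 ; 5 ; N m and ESC [ 38 ; 2 ; R ; G ; B m),
which the commit message does not spell out. The function names are
hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format "set foreground to palette entry index" (0-255).
   Returns the length snprintf would have written. */
int format_fg_256(char* buf, size_t size, unsigned index)
{
	return snprintf(buf, size, "\x1b[38;5;%um", index & 0xFF);
}

/* Format "set foreground to a 24-bit RGB value".
   Returns the length snprintf would have written. */
int format_fg_rgb(char* buf, size_t size, unsigned r, unsigned g, unsigned b)
{
	return snprintf(buf, size, "\x1b[38;2;%u;%u;%um",
	                r & 0xFF, g & 0xFF, b & 0xFF);
}
```

A program would write the formatted sequence to the terminal before the text
it wants colored, and reset afterwards with ESC [ 0 m.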
Jonas 'Sortie' Termansen e7c5d032d1 Refactor graphical resolution changes. 2016-11-27 11:18:48 +01:00
Jonas 'Sortie' Termansen b38c84852c Add pseudo terminals.
This is a compatible ABI change riding on the previous commit's bump.
2016-11-23 22:31:05 +01:00
Jonas 'Sortie' Termansen db7182ddc3 Add support for sessions.
This change refactors the process group implementation and adds support
for sessions. The setsid(2) and getsid(2) system calls were added.

psctl(2) now has PSCTL_TTYNAME, which lets you get the name of a process's
terminal, and ps(1) now uses it.

The initial terminal is now called /dev/tty1.

/dev/tty is now a factory for the current terminal.

A global lock now protects the process hierarchy which makes it safe to
access other processes. This refactor removes potential vulnerabilities
and increases system robustness.

A number of terminal ioctls have been added.

This is a compatible ABI change.
2016-11-23 22:30:47 +01:00
Jonas 'Sortie' Termansen d529a1e332 Add factory inode support. 2016-11-23 21:46:06 +01:00
Jonas 'Sortie' Termansen d720f16537 Add ONLCR and OCRNL.
This is a compatible ABI change.
2016-11-05 23:38:40 +01:00
Pedro Falcato 205a3e7156 Remove not_rsp and not_esp. 2016-10-30 12:03:47 +00:00
Jonas 'Sortie' Termansen 84c0844f56 Seed kernel entropy with randomness from the previous boot.
The bootloader will now load the /boot/random.seed file if it exists, in
which case the kernel will use it as the initial kernel entropy. The kernel
warns if no random seed was loaded, unless the --no-random-seed option was
given. This option is used for live environments that inherently have no
prior secret state. The kernel initializes its entropy pool from the random
seed as one of the first things, so randomness is available very early on.

init(8) will emit a fresh /boot/random.seed file on boot to avoid the same
entropy being used twice. init(8) also writes out /boot/random.seed on
system shutdown, when the system has the most entropy. init(8) will warn if
writing the file fails, except if /boot is a read-only filesystem and
keeping such state is impossible. The system administrator is then
responsible for ensuring the bootloader somehow passes a fresh random seed
on the next boot.

/boot/random.seed must be owned by the root user and root group and must
have file permissions 600 so that unprivileged users cannot read it. The file
is passed to the kernel by the bootloader as a multiboot module with the
command line --random-seed.

If no random seed is loaded, the kernel attempts a poor quality fallback
where it seeds the kernel arc4random(3) continuously with the current time.
The timing variance may provide some effective entropy. There is no real
kernel entropy gathering yet. The read of the CMOS real time clock is moved
to an early point in the kernel boot, so the current time is available as
fallback entropy.

The kernel access of the random seed module is supposed to be infallible
and happens before the kernel log is set up, but there is not yet a failsafe
API for mapping single pages in the early kernel.

sysupgrade(8) creates /boot/random.seed if it's absent as a temporary
compatibility measure for people upgrading from the 1.0 release. The GRUB
port will need to be upgraded with support for /boot/random.seed in the
10_sortix script. Installation with manual bootloader configuration will
need to load the random seed with the --random-seed command line. With GRUB,
this can be done with: module /boot/random.seed --random-seed
2016-10-04 00:34:50 +02:00
Jonas 'Sortie' Termansen 8ec5d9af44 Fix linked list and shadowing bugs in kernel clock and timer code. 2016-08-21 00:04:27 +02:00
Jonas 'Sortie' Termansen 2b6463aa95 Fix drivers not detecting PCI devices without an interrupt line. 2016-08-21 00:03:58 +02:00
Jonas 'Sortie' Termansen 2e03bd94d3 Add protection against sigreturn oriented programming (SROP).
This change hardens against invalid calls to sigreturn, which is a very
useful gadget when compromising a process. The system call now verifies
it is a real return from a signal and aborts the process otherwise. This
should render such attacks impossible in threads that are not servicing a
signal, and infeasible in threads that are handling signals they are yet to
return from.

The kernel now keeps track, for each thread, of how many signals are being
handled but haven't returned yet.

Each thread now has a random signal value. It is re-randomized when the
thread handles a signal and the current signal counter is zero. This value
is XORed with the context address and used as a canary on the stack during
signal dispatch, protecting the saved context on the stack. This works
mostly like the regular stack protector.

The kernel now keeps track of the stack pointer for a single handled
signal per thread. It doesn't seem worth it to keep track of multiple
handled signals, as more than one is rare. Note that each delivered signal
will not necessarily result in a sigreturn because it is valid for a thread
to longjmp(3) out of a signal handler to a valid jmp_buf.

The sigreturn system call will abort if either:

- It was not called from the kernel sigreturn page.
- The thread is not currently processing a signal.
- The thread is processing a single signal, and the stack pointer did not
  have the expected value.
- It fails to read the context on the stack.
- The canary is wrong.
2016-05-15 22:43:29 +02:00
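
The canary and counter checks can be modeled in a few lines. This is a
simplified sketch under stated assumptions, not the kernel's implementation:
all structure and function names are hypothetical, the random value is
supplied by the caller, and the checks that sigreturn was called from the
kernel sigreturn page and that the context is readable are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Saved context on the user stack (registers omitted in this sketch). */
struct signal_context
{
	uintptr_t canary;
	uintptr_t stack_pointer;
};

/* Per-thread bookkeeping (hypothetical field names). */
struct thread_state
{
	uintptr_t signal_secret;       /* random value, per thread */
	size_t signals_pending_return; /* handled but not yet returned */
	uintptr_t single_signal_sp;    /* stack pointer of a single signal */
};

/* The canary is the thread's secret XORed with the context address. */
static uintptr_t canary_for(const struct thread_state* thread,
                            const struct signal_context* ctx)
{
	return thread->signal_secret ^ (uintptr_t) ctx;
}

/* Model of signal dispatch: re-randomize when no signal is in progress,
   count the signal, and protect the saved context with the canary. */
void signal_dispatch(struct thread_state* thread, struct signal_context* ctx,
                     uintptr_t sp, uintptr_t fresh_random)
{
	if ( thread->signals_pending_return == 0 )
	{
		thread->signal_secret = fresh_random;
		thread->single_signal_sp = sp;
	}
	thread->signals_pending_return++;
	ctx->stack_pointer = sp;
	ctx->canary = canary_for(thread, ctx);
}

/* Model of the sigreturn checks. Returns false where the real system call
   would abort the process. */
bool sigreturn_is_valid(struct thread_state* thread,
                        const struct signal_context* ctx, uintptr_t sp)
{
	if ( thread->signals_pending_return == 0 )
		return false; /* not currently processing a signal */
	if ( thread->signals_pending_return == 1 &&
	     sp != thread->single_signal_sp )
		return false; /* unexpected stack pointer */
	if ( ctx->canary != canary_for(thread, ctx) )
		return false; /* saved context was tampered with or forged */
	thread->signals_pending_return--;
	return true;
}
```

Because the canary depends on both the secret and the address of the saved
context, a forged context placed elsewhere on the stack fails the comparison
unless the attacker already knows the per-thread secret.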
Jonas 'Sortie' Termansen 05282c86d7 Fix fchownat(2) system call ABI on x86.
This system call has five arguments, of which one is a 64-bit uid_t, and
another is a 64-bit gid_t, which means that 7 registers are needed. However,
x86 only has 5 registers available for system calls. Wrap the system call
with a structure like with mmap(2).
2016-03-26 23:28:36 +01:00
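
The register arithmetic and the wrapping approach can be sketched as below.
The structure layout and all names are hypothetical and only illustrate the
idea of passing one pointer instead of seven register-sized values on 32-bit
x86.

```c
#include <stddef.h>
#include <stdint.h>

#define REGISTER_BITS 32 /* 32-bit x86 */

/* Number of registers needed to pass a value of the given width. */
static size_t registers_for(size_t bits)
{
	return (bits + REGISTER_BITS - 1) / REGISTER_BITS;
}

/* Passing the five arguments directly: the two 64-bit IDs take two
   registers each. */
size_t fchownat_direct_registers(void)
{
	return registers_for(32)  /* dirfd */
	     + registers_for(32)  /* path pointer */
	     + registers_for(64)  /* owner (64-bit uid_t) */
	     + registers_for(64)  /* group (64-bit gid_t) */
	     + registers_for(32); /* flags */
}

/* Hypothetical wrapper structure: user space fills it in and passes only
   its address to the kernel. */
struct fchownat_request
{
	int dirfd;
	const char* path;
	uint64_t owner;
	uint64_t group;
	int flags;
};

/* Passing a pointer to the structure takes a single register. */
size_t fchownat_wrapped_registers(void)
{
	return registers_for(32); /* pointer to struct fchownat_request */
}
```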
Jonas 'Sortie' Termansen 2b72262b4f Relicense Sortix to the ISC license.
I hereby relicense all my work on Sortix under the ISC license as below.

All Sortix contributions by other people are already under this license,
are not substantial enough to be copyrightable, or have been removed.

All imported code from other projects is compatible with this license.

All GPL licensed code from other projects had previously been removed.

Copyright 2011-2016 Jonas 'Sortie' Termansen and contributors.

Permission to use, copy, modify, and distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
2016-03-05 22:21:50 +01:00
Jonas 'Sortie' Termansen 3487b62152 Remove dead MTRR code from the kernel. 2016-02-29 01:37:31 +01:00
Jonas 'Sortie' Termansen ede0571926 Add UTIME_NOW and UTIME_OMIT. 2016-02-24 17:32:05 +01:00
Jonas 'Sortie' Termansen 02c6316e95 Remove kernel debugger, old kernel US layout and kernel symbol code.
The debugger has fallen behind and has become a maintenance burden. It was
the only user of the old kernel US layout system, which is good to get rid
of. The debugger didn't work with graphical output and was likely to
conflict with the new keyboard system if used, which no longer triggered it.
The kernel symbol code was removed to simplify the kernel.

The kernel debugger was a useful debugging feature, but it needs to be done
in a better way before it can be added back.
2016-02-22 00:12:26 +01:00
Jonas 'Sortie' Termansen 475bd7c26e Add support for multiple initrds. 2016-02-07 14:48:27 +01:00
Jonas 'Sortie' Termansen 79e01c2eba Rewrite ATA driver. 2016-02-02 22:47:49 +01:00
Jonas 'Sortie' Termansen 2e4b15daed Simplify directory reading. 2016-01-26 18:42:54 +01:00
Jonas 'Sortie' Termansen bff1265d62 Add termios(2). 2016-01-25 15:47:40 +01:00
Jonas 'Sortie' Termansen 8f233b4a10 Add console backspace bold and underline support.
Combine textbuffer char and attr concepts while here.
2016-01-23 01:02:50 +01:00
Jonas 'Sortie' Termansen 306709fc4a Add PS/2 controller driver. 2016-01-23 00:50:53 +01:00
Jonas 'Sortie' Termansen ff8b2be515 Implement CLOCK_THREAD_CPUTIME_ID and CLOCK_THREAD_SYSTIME_ID. 2016-01-09 02:28:44 +01:00
Jonas 'Sortie' Termansen af9cc8ed05 Schedule full console redraw after user-space framebuffer write. 2016-01-08 19:56:11 +01:00
Jonas 'Sortie' Termansen dad5c57f33 Allow bootloader bitmap framebuffer modesetting. 2016-01-08 19:56:11 +01:00
Jonas 'Sortie' Termansen 8c7c6fa59f Center ASCII cat on boot. 2016-01-08 19:56:11 +01:00
Jonas 'Sortie' Termansen 22351d7f72 Fix untimely delivery of signals during userfs reference count messages. 2016-01-07 19:08:43 +01:00
Jonas 'Sortie' Termansen 559857b97e Fix features.h inclusions not yet changed to sys/cdefs.h. 2015-12-23 17:49:59 +01:00
Jonas 'Sortie' Termansen f60b2c6ec4 Add keyboard layout support to kernel. 2015-12-19 02:44:15 +01:00
Jonas 'Sortie' Termansen 4b6b06bbc8 Add scram(2). 2015-12-12 22:53:07 +01:00
Jonas 'Sortie' Termansen 0045f18c81 Remove kernel Scheduler::Init(). 2015-12-12 19:28:07 +01:00
Jonas 'Sortie' Termansen cee24359d8 Add psctl(2). 2015-12-12 19:28:07 +01:00