sortix-mirror/sortix/scheduler.cpp
Jonas 'Sortie' Termansen 51e3de971c Multithreaded kernel and improvement of signal handling.
Pardon the big ass-commit, this took months to develop and debug and the
refactoring got so far that a clean merge became impossible. The good news
is that this commit does quite a bit of cleaning up and generally improves
the kernel quality.

This makes the kernel fully pre-emptive and multithreaded. This was done
by rewriting the interrupt code, the scheduler, introducing new threading
primitives, and rewriting large parts of the kernel. During the past few
commits the kernel has had its device drivers thread secured; this commit
thread secures large parts of the core kernel. There still remains some
parts of the kernel that is _not_ thread secured, but this is not a problem
at this point. Each user-space thread has an associated kernel stack that
it uses when it goes into kernel mode. This stack is by default 8 KiB since
that value works for me and is also used by Linux. Strange things tends to
happen on x86 in case of a stack overflow - there is no ideal way to catch
such a situation right now.

The system call conventions were changed, too. The %edx register is now
used to provide the errno value of the call, instead of the kernel writing
it into a registered global variable. The system call code has also been
updated to better reflect the native calling conventions: not all registers
have to be preserved. This makes system calls faster and simplifies the
assembly. In the kernel, there is no longer the event.h header or the hacky
method of 'resuming system calls' that closely resembles cooperative
multitasking. If a system call wants to block, it should just block.

The signal handling was also improved significantly. At this point, signals
cannot interrupt kernel threads (but can always interrupt user-space threads
if enabled), which introduces some problems with how a SIGINT could
interrupt a blocking read, for instance. This commit introduces and uses a
number of new primitives such as kthread_lock_mutex_signal() that attempts
to get the lock but fails if a signal is pending. In this manner, the kernel
is safer as kernel threads cannot be shut down inconveniently, but in return
for complexity as blocking operations must check they if they should fail.

Process exiting has also been refactored significantly. The _exit(2) system
call sets the exit code and sends SIGKILL to all the threads in the process.
Once all the threads have cleaned themselves up and exited, a worker thread
calls the process's LastPrayer() method that unmaps memory, deletes the
address space, notifies the parent, etc. This provides a very robust way to
terminate processes as even half-constructed processes (during a failing fork
for instance) can be gracefully terminated.

I have introduced a number of kernel threads to help avoid threading problems
and simplify kernel design. For instance, there is now a functional generic
kernel worker thread that any kernel thread can schedule jobs for. Interrupt
handlers run with interrupts off (hence they cannot call kthread_ functions
as it may deadlock the system if another thread holds the lock) therefore
they cannot use the standard kernel worker threads. Instead, they use a
special purpose interrupt worker thread that works much like the generic one
expect that interrupt handlers can safely queue work with interrupts off.
Note that this also means that interrupt handlers cannot allocate memory or
print to the kernel log/screen as such mechanisms uses locks. I'll introduce
a lock free algorithm for such cases later on.

The boot process has also changed. The original kernel init thread in
kernel.cpp creates a new bootstrap thread and becomes the system idle thread.
Note that pid=0 now means the kernel, as there is no longer a system idle
process. The bootstrap thread launches all the kernel worker threads and then
creates a new process and loads /bin/init into it and then creates a thread
in pid=1, which starts the system. The bootstrap thread then quietly waits
for pid=1 to exit after which it shuts down/reboots/panics the system.

In general, the introduction of race conditions and dead locks have forced me
to revise a lot of the design and make sure it was thread secure. Since early
parts of the kernel was quite hacky, I had to refactor such code. So it seems
that the risk of dead locks forces me to write better code.

Note that a real preemptive multithreaded kernel simplifies the construction
of blocking system calls. My hope is that this will trigger a clean up of
the filesystem code that current is almost beyond repair.

Almost all of the kernel was modified during this refactoring. To the extent
possible, these changes have been backported to older non-multithreaded
kernel, but many changes were tightly coupled and went into this commit.

Of interest is the implementation of the kthread_ api based on the design
of pthreads; this library allows easy synchronization mechanisms and
includes C++-style scoped locks. This commit also introduces new worker
threads and tested mechanisms for interrupt handlers to schedule work in a
kernel worker thread.

A lot of code have been rewritten from scratch and has become a lot more
stable and correct.

Share and enjoy!
2012-09-08 18:45:41 +02:00

294 lines
8.5 KiB
C++

/*******************************************************************************
Copyright(C) Jonas 'Sortie' Termansen 2011, 2012.
This file is part of Sortix.
Sortix is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.
Sortix is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details.
You should have received a copy of the GNU General Public License along with
Sortix. If not, see <http://www.gnu.org/licenses/>.
scheduler.cpp
Decides the order to execute threads in and switching between them.
*******************************************************************************/
#include <sortix/kernel/platform.h>
#include <sortix/kernel/memorymanagement.h>
#include <libmaxsi/error.h>
#include <libmaxsi/memory.h>
#include "x86-family/gdt.h"
#include "syscall.h"
#include "interrupt.h"
#include "time.h"
#include "thread.h"
#include "process.h"
#include "signal.h"
#include "scheduler.h"
using namespace Maxsi;
namespace Sortix {
namespace Scheduler {
static Thread* currentthread;
} // namespace Scheduler
Thread* CurrentThread() { return Scheduler::currentthread; }
Process* CurrentProcess() { return CurrentThread()->process; }
namespace Scheduler {
uint8_t dummythreaddata[sizeof(Thread)] __attribute__ ((aligned (8)));
Thread* dummythread;
Thread* idlethread;
Thread* firstrunnablethread;
Thread* firstsleepingthread;
Process* initprocess;
static inline void SetCurrentThread(Thread* newcurrentthread)
{
currentthread = newcurrentthread;
}
void LogBeginSwitch(Thread* current, const CPU::InterruptRegisters* regs);
void LogSwitch(Thread* current, Thread* next);
void LogEndSwitch(Thread* current, const CPU::InterruptRegisters* regs);
static Thread* PopNextThread()
{
if ( !firstrunnablethread ) { return idlethread; }
Thread* result = firstrunnablethread;
firstrunnablethread = firstrunnablethread->schedulerlistnext;
return result;
}
static Thread* ValidatedPopNextThread()
{
Thread* nextthread = PopNextThread();
if ( !nextthread ) { Panic("Had no thread to switch to."); }
if ( nextthread->terminated )
{
PanicF("Running a terminated thread 0x%p", nextthread);
}
addr_t newaddrspace = nextthread->process->addrspace;
if ( !Page::IsAligned(newaddrspace) )
{
PanicF("Thread 0x%p, process %i (0x%p) (backup: %i), had bad "
"address space variable: 0x%zx: not page-aligned "
"(backup: 0x%zx)\n", nextthread,
nextthread->process->pid, nextthread->process,
-1/*nextthread->pidbackup*/, newaddrspace,
(addr_t)-1 /*nextthread->addrspacebackup*/);
}
return nextthread;
}
static void DoActualSwitch(CPU::InterruptRegisters* regs)
{
Thread* current = CurrentThread();
LogBeginSwitch(current, regs);
Thread* next = ValidatedPopNextThread();
LogSwitch(current, next);
if ( current == next ) { return; }
current->SaveRegisters(regs);
next->LoadRegisters(regs);
addr_t newaddrspace = next->addrspace;
Memory::SwitchAddressSpace(newaddrspace);
SetCurrentThread(next);
addr_t stacklower = next->kernelstackpos;
size_t stacksize = next->kernelstacksize;
addr_t stackhigher = stacklower + stacksize;
ASSERT(stacklower && stacksize && stackhigher);
GDT::SetKernelStack(stacklower, stacksize, stackhigher);
LogEndSwitch(next, regs);
}
void Switch(CPU::InterruptRegisters* regs)
{
DoActualSwitch(regs);
if ( regs->signal_pending && regs->InUserspace() )
Signal::Dispatch(regs);
}
const bool DEBUG_BEGINCTXSWITCH = false;
const bool DEBUG_CTXSWITCH = false;
const bool DEBUG_ENDCTXSWITCH = false;
void LogBeginSwitch(Thread* current, const CPU::InterruptRegisters* regs)
{
bool alwaysdebug = false;
bool isidlethread = current == idlethread;
bool dodebug = DEBUG_BEGINCTXSWITCH && !isidlethread;
if ( alwaysdebug || dodebug )
{
Log::PrintF("Switching from 0x%p", current);
regs->LogRegisters();
Log::Print("\n");
}
}
void LogSwitch(Thread* current, Thread* next)
{
bool alwaysdebug = false;
bool different = current == idlethread;
bool dodebug = DEBUG_CTXSWITCH && different;
if ( alwaysdebug || dodebug )
{
Log::PrintF("switching from %u:%u (0x%p) to %u:%u (0x%p) \n",
current->process->pid, 0, current,
next->process->pid, 0, next);
}
}
void LogEndSwitch(Thread* current, const CPU::InterruptRegisters* regs)
{
bool alwaysdebug = false;
bool isidlethread = current == idlethread;
bool dodebug = DEBUG_BEGINCTXSWITCH && !isidlethread;
if ( alwaysdebug || dodebug )
{
Log::PrintF("Switched to 0x%p", current);
regs->LogRegisters();
Log::Print("\n");
}
}
static void InterruptYieldCPU(CPU::InterruptRegisters* regs, void* /*user*/)
{
Switch(regs);
}
static void ThreadExitCPU(CPU::InterruptRegisters* regs, void* /*user*/)
{
SetThreadState(currentthread, Thread::State::DEAD);
InterruptYieldCPU(regs, NULL);
}
// The idle thread serves no purpose except being an infinite loop that does
// nothing, which is only run when the system has nothing to do.
void SetIdleThread(Thread* thread)
{
ASSERT(!idlethread);
idlethread = thread;
SetThreadState(thread, Thread::State::NONE);
SetCurrentThread(thread);
}
void SetDummyThreadOwner(Process* process)
{
dummythread->process = process;
}
void SetInitProcess(Process* init)
{
initprocess = init;
}
Process* GetInitProcess()
{
return initprocess;
}
void SetThreadState(Thread* thread, Thread::State state)
{
bool wasenabled = Interrupt::SetEnabled(false);
// Remove the thread from the list of runnable threads.
if ( thread->state == Thread::State::RUNNABLE &&
state != Thread::State::RUNNABLE )
{
if ( thread == firstrunnablethread ) { firstrunnablethread = thread->schedulerlistnext; }
if ( thread == firstrunnablethread ) { firstrunnablethread = NULL; }
ASSERT(thread->schedulerlistprev);
ASSERT(thread->schedulerlistnext);
thread->schedulerlistprev->schedulerlistnext = thread->schedulerlistnext;
thread->schedulerlistnext->schedulerlistprev = thread->schedulerlistprev;
thread->schedulerlistprev = NULL;
thread->schedulerlistnext = NULL;
}
// Insert the thread into the scheduler's carousel linked list.
if ( thread->state != Thread::State::RUNNABLE &&
state == Thread::State::RUNNABLE )
{
if ( firstrunnablethread == NULL ) { firstrunnablethread = thread; }
thread->schedulerlistprev = firstrunnablethread->schedulerlistprev;
thread->schedulerlistnext = firstrunnablethread;
firstrunnablethread->schedulerlistprev = thread;
thread->schedulerlistprev->schedulerlistnext = thread;
}
thread->state = state;
ASSERT(thread->state != Thread::State::RUNNABLE || thread->schedulerlistprev);
ASSERT(thread->state != Thread::State::RUNNABLE || thread->schedulerlistnext);
Interrupt::SetEnabled(wasenabled);
}
Thread::State GetThreadState(Thread* thread)
{
return thread->state;
}
void SysSleep(size_t secs)
{
uintmax_t timetosleep = ((uintmax_t) secs) * 1000ULL * 1000ULL;
uint32_t wakeat = Time::MicrosecondsSinceBoot() + timetosleep;
do { Yield(); }
while ( Time::MicrosecondsSinceBoot() < wakeat );
}
void SysUSleep(size_t usecs)
{
uintmax_t timetosleep = (uintmax_t) usecs;
uint32_t wakeat = Time::MicrosecondsSinceBoot() + timetosleep;
do { Yield(); }
while ( Time::MicrosecondsSinceBoot() < wakeat );
}
extern "C" void yield_cpu_handler();
extern "C" void thread_exit_handler();
void Init()
{
// We use a dummy so that the first context switch won't crash when the
// current thread is accessed. This lets us avoid checking whether it is
// NULL (which it only will be once), which gives simpler code.
dummythread = (Thread*) &dummythreaddata;
Maxsi::Memory::Set(dummythread, 0, sizeof(*dummythread));
dummythread->schedulerlistprev = dummythread;
dummythread->schedulerlistnext = dummythread;
currentthread = dummythread;
firstrunnablethread = NULL;
firstsleepingthread = NULL;
idlethread = NULL;
// Register our raw handler with user-space access. It calls our real
// handler after common interrupt preparation stuff has occured.
Interrupt::RegisterRawHandler(129, yield_cpu_handler, true);
Interrupt::RegisterHandler(129, InterruptYieldCPU, NULL);
Interrupt::RegisterRawHandler(132, thread_exit_handler, true);
Interrupt::RegisterHandler(132, ThreadExitCPU, NULL);
Syscall::Register(SYSCALL_SLEEP, (void*) SysSleep);
Syscall::Register(SYSCALL_USLEEP, (void*) SysUSleep);
}
} // namespace Scheduler
} // namespace Sortix