From: Bruce Momjian
Date: Fri, 28 Sep 2001 19:06:50 +0000 (+0000)
Subject: Add to thread thread.
X-Git-Tag: REL7_2_BETA1~300
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=7fb60b06ffd748d8fe38136c9db5ca66b62cfeaf;p=postgresql

Add to thread thread.
---

diff --git a/doc/TODO.detail/thread b/doc/TODO.detail/thread
index 5a4d3cfa2f..3ba4176b4f 100644
--- a/doc/TODO.detail/thread
+++ b/doc/TODO.detail/thread
@@ -951,3 +951,479 @@ good and what is not matters, but it is good for another view point).
 ---------------------------(end of broadcast)---------------------------
 TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
+From pgsql-hackers-owner+M13607=candle.pha.pa.us=pgman@postgresql.org Wed Sep 26 19:14:59 2001
+Return-path:
+Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238] (may be forged))
+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f8QNExo15536
+ for ; Wed, 26 Sep 2001 19:14:59 -0400 (EDT)
+Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f8QNF8423944
+ for ; Wed, 26 Sep 2001 18:15:09 -0500 (CDT)
+ (envelope-from pgsql-hackers-owner+M13607=candle.pha.pa.us=pgman@postgresql.org)
+Received: from belphigor.mcnaught.org ([216.151.155.121])
+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f8QMe3h07256
+ for ; Wed, 26 Sep 2001 18:40:04 -0400 (EDT)
+ (envelope-from doug@wireboard.com)
+Received: (from doug@localhost)
+ by belphigor.mcnaught.org (8.11.6/8.9.3) id f8QMdkB05502;
+ Wed, 26 Sep 2001 18:39:46 -0400
+X-Authentication-Warning: belphigor.mcnaught.org: doug set sender to doug@wireboard.com using -f
+To: "D. Hageman"
+cc: mlw ,
+ "pgsql-hackers@postgresql.org"
+Subject: Re: [HACKERS] Spinlock performance improvement proposal
+References:
+From: Doug McNaught
+Date: 26 Sep 2001 18:39:44 -0400
+In-Reply-To: "D. Hageman"'s message of "Wed, 26 Sep 2001 16:14:22 -0500 (CDT)"
+Message-ID:
+Lines: 26
+User-Agent: Gnus/5.0806 (Gnus v5.8.6) XEmacs/21.1 (20 Minutes to Nikko)
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+"D. Hageman" writes:
+
+> Save for the fact that the kernel can switch between threads faster then
+> it can switch processes considering threads share the same address space,
+> stack, code, etc. If need be sharing the data between threads is much
+> easier then sharing between processes.
+
+This depends on your system. Solaris has a huge difference between
+thread and process context switch times, whereas Linux has very little
+difference (and in fact a Linux process context switch is about as
+fast as a Solaris thread switch on the same hardware--Solaris is just
+a pig when it comes to process context switching).
+
+> I can't comment on the "isolate data" line. I am still trying to figure
+> that one out.
+
+I think his point is one of clarity and maintainability. When a
+task's data is explicitly shared (via shared memory of some sort) it's
+fairly clear when you're accessing shared data and need to worry about
+locking. Whereas when all data is shared by default (as with threads)
+it's very easy to miss places where threads can step on each other.
+
+-Doug
+--
+In a world of steel-eyed death, and men who are fighting to be warm,
+Come in, she said, I'll give you shelter from the storm. -Dylan
+
+---------------------------(end of broadcast)---------------------------
+TIP 2: you can get off all lists at once with the unregister command
+ (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
+
+From pgsql-hackers-owner+M13611=candle.pha.pa.us=pgman@postgresql.org Wed Sep 26 21:05:02 2001
+Return-path:
+Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238] (may be forged))
+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f8R152o22010
+ for ; Wed, 26 Sep 2001 21:05:02 -0400 (EDT)
+Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f8R158430261
+ for ; Wed, 26 Sep 2001 20:05:08 -0500 (CDT)
+ (envelope-from pgsql-hackers-owner+M13611=candle.pha.pa.us=pgman@postgresql.org)
+Received: from sss.pgh.pa.us ([192.204.191.242])
+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f8R0lgh29430
+ for ; Wed, 26 Sep 2001 20:47:42 -0400 (EDT)
+ (envelope-from tgl@sss.pgh.pa.us)
+Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
+ by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f8R0kpK14707;
+ Wed, 26 Sep 2001 20:46:51 -0400 (EDT)
+To: Ian Lance Taylor
+cc: "D. Hageman" , mlw ,
+ "pgsql-hackers@postgresql.org"
+Subject: Re: [HACKERS] Spinlock performance improvement proposal
+In-Reply-To:
+References:
+Comments: In-reply-to Ian Lance Taylor
+ message dated "26 Sep 2001 15:04:41 -0700"
+Date: Wed, 26 Sep 2001 20:46:51 -0400
+Message-ID: <14704.1001551611@sss.pgh.pa.us>
+From: Tom Lane
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+Ian Lance Taylor writes:
+> (Actually, though, Postgres is already vulnerable to erratic behaviour
+> because any backend process can corrupt the shared buffer pool.)
+
+Not to mention the other parts of shared memory.
+
+Nonetheless, our experience has been that cross-backend failures due to
+memory clobbers in shared memory are very infrequent --- certainly far
+less often than we see localized-to-a-backend crashes. Probably this is
+because the shared memory is (a) small compared to the rest of the
+address space and (b) only accessed by certain specific modules within
+Postgres.
+
+I'm convinced that switching to a thread model would result in a
+significant degradation in our ability to recover from coredump-type
+failures, even given the (implausible) assumption that we introduce no
+new bugs during the conversion. I'm also *un*convinced that such a
+conversion will yield significant performance benefits, unless we
+introduce additional cross-thread dependencies (and more fragility
+and lock contention) by tactics such as sharing catalog caches across
+threads.
+
+ regards, tom lane
+
+---------------------------(end of broadcast)---------------------------
+TIP 3: if posting/reading through Usenet, please send an appropriate
+subscribe-nomail command to majordomo@postgresql.org so that your
+message can get through to the mailing list cleanly
+
+From pgsql-hackers-owner+M13616=candle.pha.pa.us=pgman@postgresql.org Wed Sep 26 23:10:52 2001
+Return-path:
+Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238] (may be forged))
+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f8R3Aqo03180
+ for ; Wed, 26 Sep 2001 23:10:52 -0400 (EDT)
+Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f8R3B3438816
+ for ; Wed, 26 Sep 2001 22:11:03 -0500 (CDT)
+ (envelope-from pgsql-hackers-owner+M13616=candle.pha.pa.us=pgman@postgresql.org)
+Received: from spider.pilosoft.com (p55-222.acedsl.com [160.79.55.222])
+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f8R2vCh48923
+ for ; Wed, 26 Sep 2001 22:57:12 -0400 (EDT)
+ (envelope-from alex@pilosoft.com)
+Received: from localhost (alexmail@localhost)
+ by spider.pilosoft.com (8.9.3/8.9.3) with ESMTP id WAA27630;
+ Wed, 26 Sep 2001 22:58:41 -0400 (EDT)
+Date: Wed, 26 Sep 2001 22:58:41 -0400 (EDT)
+From: Alex Pilosov
+To: "D. Hageman"
+cc: "pgsql-hackers@postgresql.org"
+Subject: Re: [HACKERS] Spinlock performance improvement proposal
+In-Reply-To:
+Message-ID:
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+On Wed, 26 Sep 2001, D. Hageman wrote:
+
+> > > Save for the fact that the kernel can switch between threads faster then
+> > > it can switch processes considering threads share the same address space,
+> > > stack, code, etc. If need be sharing the data between threads is much
+> > > easier then sharing between processes.
+> >
+> > When using a kernel threading model, it's not obvious to me that the
+> > kernel will switch between threads much faster than it will switch
+> > between processes. As far as I can see, the only potential savings is
+> > not reloading the pointers to the page tables. That is not nothing,
+> > but it is also
+
+> > > I can't comment on the "isolate data" line. I am still trying to figure
+> > > that one out.
+> >
+> > Sometimes you need data which is specific to a particular thread.
+>
+> When you need data that is specific to a thread you use a TSD (Thread
+> Specific Data).
+Which Linux does not support with a vengeance, to my knowledge.
+
+As a matter of fact, quote from Linus on the matter was something like
+"Solution to slow process switching is fast process switching, not another
+kernel abstraction [referring to threads and TSD]". TSDs make
+implementation of thread switching complex, and fork() complex.
+
+The question about threads boils down to: Is there far more data that is
+shared than unshared? If yes, threads are better, if not, you'll be
+abusing TSD and slowing things down.
+
+I believe right now, postgresql' model of sharing only things that need to
+be shared is pretty damn good. The only slight problem is overhead of
+forking another backend, but its still _fast_.
+
+IMHO, threads would not bring large improvement to postgresql.
+
+ Actually, if I remember, there was someone who ported postgresql (I think
+it was 6.5) to be multithreaded with major pain, because the requirement
+was to integrate with CORBA. I believe that person posted some benchmarks
+which were essentially identical to non-threaded postgres...
+
+-alex
+
+
+---------------------------(end of broadcast)---------------------------
+TIP 6: Have you searched our list archives?
+
+http://archives.postgresql.org
+
+From pgsql-hackers-owner+M13619=candle.pha.pa.us=pgman@postgresql.org Thu Sep 27 00:32:55 2001
+Return-path:
+Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238] (may be forged))
+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f8R4Wto07075
+ for ; Thu, 27 Sep 2001 00:32:55 -0400 (EDT)
+Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f8R4X7444942
+ for ; Wed, 26 Sep 2001 23:33:07 -0500 (CDT)
+ (envelope-from pgsql-hackers-owner+M13619=candle.pha.pa.us=pgman@postgresql.org)
+Received: from sss.pgh.pa.us ([192.204.191.242])
+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f8R4Jsh61257
+ for ; Thu, 27 Sep 2001 00:19:54 -0400 (EDT)
+ (envelope-from tgl@sss.pgh.pa.us)
+Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
+ by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f8R4JLK15406;
+ Thu, 27 Sep 2001 00:19:21 -0400 (EDT)
+To: "D. Hageman"
+cc: Alex Pilosov ,
+ "pgsql-hackers@postgresql.org"
+Subject: Re: [HACKERS] Spinlock performance improvement proposal
+In-Reply-To:
+References:
+Comments: In-reply-to "D. Hageman"
+ message dated "Wed, 26 Sep 2001 22:41:39 -0500"
+Date: Thu, 27 Sep 2001 00:19:20 -0400
+Message-ID: <15403.1001564360@sss.pgh.pa.us>
+From: Tom Lane
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+"D. Hageman" writes:
+> If you look at Myron Scott's post today you will see that it had other
+> advantages going for it (like auto-vacuum!) and disadvantages ... rogue
+> thread corruption (already debated today).
+
+But note that Myron did a number of things that are (IMHO) orthogonal
+to process-to-thread conversion, such as adding prepared statements,
+a separate thread/process/whateveryoucallit for buffer writing, ditto
+for vacuuming, etc. I think his results cannot be taken as indicative
+of the benefits of threads per se --- these other things could be
+implemented in a pure process model too, and we have no data with which
+to estimate which change bought how much.
+
+Threading certainly should reduce the context switch time, but this
+comes at the price of increased overhead within each context (since
+access to thread-local variables is not free). It's by no means
+obvious that there's a net win there.
+
+ regards, tom lane
+
+---------------------------(end of broadcast)---------------------------
+TIP 6: Have you searched our list archives?
+
+http://archives.postgresql.org
+
+From pgsql-hackers-owner+M13621=candle.pha.pa.us=pgman@postgresql.org Thu Sep 27 01:59:44 2001
+Return-path:
+Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238] (may be forged))
+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f8R5xio11898
+ for ; Thu, 27 Sep 2001 01:59:44 -0400 (EDT)
+Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f8R5xi449748
+ for ; Thu, 27 Sep 2001 00:59:45 -0500 (CDT)
+ (envelope-from pgsql-hackers-owner+M13621=candle.pha.pa.us=pgman@postgresql.org)
+Received: from goldengate.kojoworldwide.com. ([216.133.4.130])
+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f8R5joh75612
+ for ; Thu, 27 Sep 2001 01:45:50 -0400 (EDT)
+ (envelope-from mscott@sacadia.com)
+Received: from localhost (localhost [127.0.0.1])
+ by goldengate.kojoworldwide.com. (8.9.1b+Sun/8.9.2) with ESMTP id WAA01144
+ for ; Wed, 26 Sep 2001 22:24:29 -0700 (PDT)
+Date: Wed, 26 Sep 2001 22:24:29 -0700 (PDT)
+From: Myron Scott
+X-Sender: mscott@goldengate.kojoworldwide.com.
+To: pgsql-hackers@postgresql.org
+Subject: Re: [HACKERS] Spinlock performance improvement proposal
+In-Reply-To: <15403.1001564360@sss.pgh.pa.us>
+Message-ID:
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+
+
+> But note that Myron did a number of things that are (IMHO) orthogonal
+
+yes, I did :)
+
+> to process-to-thread conversion, such as adding prepared statements,
+> a separate thread/process/whateveryoucallit for buffer writing, ditto
+> for vacuuming, etc. I think his results cannot be taken as indicative
+> of the benefits of threads per se --- these other things could be
+> implemented in a pure process model too, and we have no data with which
+> to estimate which change bought how much.
+>
+
+If you are comparing just process vs. thread, I really don't think I
+gained much for performance and ended up with some pretty unmanageable
+code.
+
+The one thing that led to most of the gains was scheduling all the writes
+to one thread which, as noted by Tom, you could do on the process model.
+Besides, Most of the advantage in doing this was taken away with the
+addition of WAL in 7.1.
+
+The other real gain that I saw with threading was limiting the number of
+open files but
+that led me to alter much of the file manager in order to synchronize
+access to the files which probably slowed things a bit.
+
+To be honest, I don't think I, personally,
+would try this again. I went pretty far off
+the beaten path with this thing. It works well for what I am doing
+( a limited number of SQL statements run many times over ) but there
+probably was a better way. I'm thinking now that I should have tried to
+add a CORBA interface for connections. I would have been able to
+accomplish my original goals without creating a deadend for myself.
+
+
+Thanks all for a great project,
+
+Myron
+mscott@sacadia.com
+
+
+---------------------------(end of broadcast)---------------------------
+TIP 2: you can get off all lists at once with the unregister command
+ (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
+
+From pgsql-hackers-owner+M13632=candle.pha.pa.us=pgman@postgresql.org Thu Sep 27 10:21:22 2001
+Return-path:
+Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238] (may be forged))
+ by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f8RELLo08607
+ for ; Thu, 27 Sep 2001 10:21:21 -0400 (EDT)
+Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
+ by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f8RELP487000
+ for ; Thu, 27 Sep 2001 09:21:26 -0500 (CDT)
+ (envelope-from pgsql-hackers-owner+M13632=candle.pha.pa.us=pgman@postgresql.org)
+Received: from gromit.dotclick.com (ipn9-f8366.net-resource.net [216.204.83.66])
+ by postgresql.org (8.11.3/8.11.4) with ESMTP id f8RE49h21870
+ for ; Thu, 27 Sep 2001 10:04:09 -0400 (EDT)
+ (envelope-from markw@mohawksoft.com)
+Received: from mohawksoft.com (IDENT:markw@localhost.localdomain [127.0.0.1])
+ by gromit.dotclick.com (8.9.3/8.9.3) with ESMTP id KAA24417;
+ Thu, 27 Sep 2001 10:02:06 -0400
+Message-ID: <3BB3315D.EC99FF65@mohawksoft.com>
+Date: Thu, 27 Sep 2001 10:02:05 -0400
+From: mlw
+X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.4.2 i686)
+X-Accept-Language: en
+MIME-Version: 1.0
+To: "D. Hageman"
+cc: Ian Lance Taylor ,
+ "pgsql-hackers@postgresql.org"
+Subject: Re: [HACKERS] Spinlock performance improvement proposal
+References:
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+"D. Hageman" wrote:
+
+> On 26 Sep 2001, Ian Lance Taylor wrote:
+> >
+> > > Save for the fact that the kernel can switch between threads faster then
+> > > it can switch processes considering threads share the same address space,
+> > > stack, code, etc. If need be sharing the data between threads is much
+> > > easier then sharing between processes.
+> >
+> > When using a kernel threading model, it's not obvious to me that the
+> > kernel will switch between threads much faster than it will switch
+> > between processes. As far as I can see, the only potential savings is
+> > not reloading the pointers to the page tables. That is not nothing,
+> > but it is also not a lot.
+>
+> It is my understanding that avoiding a full context switch of the
+> processor can be of a significant advantage. This is especially important
+> on processor architectures that can be kinda slow at doing it (x86). I
+> will admit that most modern kernels have features that assist software
+> packages utilizing the forking model (copy on write for instance). It is
+> also my impression that these do a good job. I am the kind of guy that
+> looks towards the future (as in a year, year and half or so) and say that
+> processors will hopefully get faster at context switching and more and
+> more kernels will implement these algorithms to speed up the forking
+> model. At the same time, I see more and more processors being shoved into
+> a single box and it appears that the threads model works better on these
+> type of systems.
+
+"context" switching happens all the time on a multitasking system. On the x86
+processor, a context switch happens when you call into the kernel. You have to go
+through a call-gate to get to a lower privilege ring. "context" switching is very
+fast. The operating system dictates how heavy or light a process switch is. Under
+Linux (and I believe FreeBSD with Linux threads, or version 4.x ) threads and
+processes are virtually identical. The only difference is that the virtual memory
+pages are not "copy on write." Process vs thread scheduling is also virtually
+identical.
+
+If you look to the future, then you should accept that process switching should
+become more efficient as the operating systems improve.
+
+>
+> > > I can't comment on the "isolate data" line. I am still trying to figure
+> > > that one out.
+> >
+> > Sometimes you need data which is specific to a particular thread.
+>
+> When you need data that is specific to a thread you use a TSD (Thread
+> Specific Data).
+
+Yes, but Postgres has many global variables. The assumption has always been that
+it is a stand-alone process with an explicitly shared paradigm, not implicitly.
+
+>
+> > Basically, you have to look at every global variable in the Postgres
+> > backend, and determine whether to share it among all threads or to
+> > make it thread-specific.
+>
+> Yes, if one was to implement threads into PostgreSQL I would think that
+> some re-writing would be in order of several areas. Like I said before,
+> give a person a chance to restructure things so future TODO items wouldn't
+> be so hard to implement. Personally, I like to stay away from global
+> variables as much as possible. They just get you into trouble.
+
+In real live software, software which lives from year to year with active
+development, things do get messy. There are always global variables involved in a
+program. Efforts, of course, should be made to keep them to a minimum, but the
+reality is that they always happen.
+
+Also, the very structure of function calls may need to change when going from a
+process model to a threaded model. Functions never before reentrant are now be
+reentrant, think about that. That is a huge undertaking. Every single function
+may need to be examined for thread safety, with little benefit.
+
+>
+> > > That last line is a troll if I every saw it ;-) I will agree that threads
+> > > isn't for everything and that it has costs just like everything else. Let
+> > > me stress that last part - like everything else. Certain costs exist in
+> > > the present model, nothing is - how should we say ... perfect.
+> >
+> > When writing in C, threading inevitably loses robustness. Erratic
+> > behaviour by one thread, perhaps in a user defined function, can
+> > subtly corrupt the entire system, rather than just that thread. Part
+> > of defensive programming is building barriers between different parts
+> > of a system. Process boundaries are a powerful barrier.
+>
+> I agree with everything you wrote above except for the first line. My
+> only comment is that process boundaries are only *truely* a powerful
+> barrier if the processes are different pieces of code and are not
+> dependent on each other in crippling ways. Forking the same code with the
+> bug in it - and only 1 in 5 die - is still 4 copies of buggy code running
+> on your system ;-)
+
+This is simply not true. All software has bugs, it is an undeniable fact. Some
+bugs are more likely to be hit than others. 5 processes , when one process hits a
+bug, that does not mean the other 4 will hit the same bug. Obscure bugs kill
+software all the time, the trick is to minimize the impact. Software is not
+perfect, assuming it can be is a mistake.
+
+
+
+
+
+---------------------------(end of broadcast)---------------------------
+TIP 5: Have you checked our extensive FAQ?
+
+http://www.postgresql.org/users-lounge/docs/faq.html
+