Here's an example with n = 800, k = 300. The times are not huge.

Original Message:

Sent: 09-02-2024 11:26

From: Virginia Rovnyak

Subject: A sampling question

The question comes from a physical situation, where values of k would be in the 100-300 range, out of n=500-1,000. An asymptotic formula would be the best. The sampling will be done sequentially.
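For n and k in that range, one heuristic approximation can be sketched as follows. Sequential weighted sampling without replacement is equivalent to giving each point an independent exponential "arrival time" with rate equal to its weight and keeping the k earliest arrivals. If the k-th arrival time is replaced by a fixed time t chosen so that the expected number of arrivals by t equals k, the inclusion probability of point i becomes roughly 1 - exp(-w_i t). The replacement of a random time by a fixed one is the approximating step, and `approx_inclusion` is a hypothetical helper name, not something from this thread.

```r
# Hypothetical helper: approximate inclusion probabilities for sequential
# weighted sampling without replacement (exponential-clock heuristic).
approx_inclusion <- function(w, k) {
  stopifnot(k >= 1, k < length(w), all(w > 0))
  w <- w / sum(w)                      # normalize weights to sum to 1
  # Expected number of points that have "arrived" by time t, minus k
  f <- function(t) sum(1 - exp(-w * t)) - k
  # f(0) = -k < 0, so double the upper bracket until f is non-negative
  upper <- 1
  while (f(upper) < 0) upper <- 2 * upper
  t_star <- uniroot(f, lower = 0, upper = upper, tol = 1e-10)$root
  1 - exp(-w * t_star)
}

set.seed(1)
n <- 800; k <- 300
w <- runif(n)                          # illustrative weights only
pi_hat <- approx_inclusion(w, k)
sum(pi_hat)                            # equals k up to root-finding error
```

On the six-point example elsewhere in this thread with k = 3, this gives roughly 0.31, 0.52, 0.67 against exact values of 0.296, 0.525, 0.679, so it is worth checking against simulation before relying on it.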

------------------------------

Virginia Rovnyak

Original Message:

Sent: 09-01-2024 02:47

From: Jim Baldwin

Subject: A sampling question

Note that the exact inclusion probabilities for Anthony Warrack's R code when k=2 can be obtained using his notation with

`w*(1 - w/(1 - w) + sum(w/(1 - w)))`
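For readers who want to see where this expression comes from, here is a short derivation. It assumes the weights w_i are normalized to sum to 1 and that the second draw uses the remaining weights renormalized, which is how the sequential scheme in this thread works. Point i is either drawn first, or some other point j is drawn first and i second:

```latex
\pi_i
  = w_i + \sum_{j \ne i} w_j \,\frac{w_i}{1 - w_j}
  = w_i \left( 1 - \frac{w_i}{1 - w_i} + \sum_{j=1}^{n} \frac{w_j}{1 - w_j} \right)
```

The second form just adds and subtracts the j = i term so that the sum runs over all points, which matches the vectorized R expression.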

Here is some Mathematica code to obtain exact inclusion probabilities for other values of k. (Note that the code enumerates every ordered sample, so with even moderate values of k and n the calculations might take longer than the age of the Earth.)

`(* Function to calculate probability of each sample *)`

`pr[indices_, p_] := Module[{remaining, prob},`

` remaining = 1 - p[[indices[[1]]]];`

` prob = p[[indices[[1]]]];`

` Do[prob = prob*p[[indices[[j]]]]/remaining; `

` remaining = remaining - p[[indices[[j]]]], {j, 2, Length[indices]}];`

` prob`

` ]`

` `

`(* Function to calculate inclusion probabilities *)`

`inclusionProbabilities[p_, k_] := Module[{perms, allProb, data},`

` If[k == 2, p (1 - p/(1 - p) + Total[p/(1 - p)]),`

` perms = Permutations[Range[Length[p]], {k}]; (* Generate all possible samples *)`

` allProb = pr[#, p] & /@ perms ; (* Get probabilities of every possible sample: these sum to 1 *)`

` data = Transpose[{perms, allProb}]; (* Combine samples with associated probability of selection *)`

` (* Find the probabilities of inclusion for each element *)`

` Table[Total[Select[data, MemberQ[#[[1]], i] &][[All, 2]]], {i, 1, Length[p]}]]`

` ]`

` `

`(* Samples of size 3 *)`

`p = {1/12, 1/12, 2/12, 3/12, 3/12, 2/12};`

`inclusionProbabilities[p, 3]`

`(* {8203/27720,8203/27720,1819/3465,1255/1848,1255/1848,1819/3465} *)`

`inclusionProbabilities[p, 3] // N`

`(* {0.295924, 0.295924, 0.524964, 0.679113, 0.679113, 0.524964} *)`

Here is a slightly larger example:

`p = RandomVariate[UniformDistribution[{0, 1}], 20];`

`p = p/Total[p]`

`(* {0.0423276, 0.0266268, 0.0823042, 0.0855377, 0.0168512, 0.0535499, 0.068562, 0.0471883, 0.0100306,`

` 0.0546233, 0.0347716, 0.0926354, 0.00674148, 0.0390817, 0.0559113, 0.0806157, 0.0890967, 0.00176155, `

` 0.0600508, 0.0517321} *)`

`inclusionProbabilities[p, 4]`

`(* {0.176228, 0.113839, 0.318929, 0.329457, 0.0732076, 0.218636, 0.272497, 0.194821, 0.044056, 0.222596, `

` 0.146647, 0.352041, 0.0297645, 0.163622, 0.227326, 0.313371, 0.340871, 0.00783841, 0.242363, 0.211891} *)`
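The enumeration above visits every ordered sample, of which there are n!/(n-k)!. As a possible alternative for small n, here is a sketch in R of a recursion over unordered subsets, which needs at most 2^n states. It relies on the fact that the probability that the first m draws form a given set S can be built from the same quantity for S with its last-drawn point removed. The function name `exact_inclusion` and the cap at n = 20 are my own choices, and this is still far too slow for n in the hundreds.

```r
# Hypothetical helper: exact inclusion probabilities via a subset recursion.
# P[mask + 1] holds the probability that the first |S| draws are exactly the
# set S encoded by the bits of `mask`.
exact_inclusion <- function(w, k) {
  n <- length(w)
  stopifnot(n <= 20, k >= 1, k < n, all(w > 0))
  w <- w / sum(w)
  bit <- as.integer(2^(0:(n - 1)))     # bit[i] marks point i
  P <- numeric(2^n)
  P[1] <- 1                            # empty set before any draw
  incl <- numeric(n)
  for (mask in 1:(2^n - 1)) {
    members <- which((mask %/% bit) %% 2L == 1L)
    m <- length(members)
    if (m > k) next                    # sets larger than k are never needed
    total <- sum(w[members])
    for (i in members) {
      prev <- mask - bit[i]            # the set before point i was drawn
      # point i is drawn last, from the weight not yet removed
      P[mask + 1] <- P[mask + 1] + P[prev + 1] * w[i] / (1 - (total - w[i]))
    }
    if (m == k) incl[members] <- incl[members] + P[mask + 1]
  }
  incl
}

w <- c(1, 1, 2, 3, 3, 2) / 12
exact_inclusion(w, 3)                  # compare with the Mathematica output above
```

Masks are processed in increasing order, so every smaller subset is already filled in when it is needed.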

------------------------------

Jim Baldwin

Retired

Original Message:

Sent: 08-31-2024 13:10

From: Anthony Warrack

Subject: A sampling question

Virginia, I think the following R program should provide reasonable estimates of the inclusion probability for each point (note: these add to k, not to one). Also note that points with the same weight should have the same selection probability. This can be enforced by averaging the estimates over points with equal weights.

###### ####################

n <- 6 ; k <- 2 # choose n and k

x <- 1:n # number the points from 1 to n

w <- c(1/12,1/12,2/12,3/12,3/12,2/12) # given weights, assumed to be sampling probabilities

nsim <- 10000 # choose number of simulations

M <- matrix(nrow = nsim,ncol = k, byrow = TRUE) # M is nsim by k, lists points selected for each simulation

for(i in 1:nsim){

y <- sample(x, k, replace = FALSE, prob = w) # name the arguments so w is clearly the prob vector

M[i,] <- y

}

MT <- table(factor(M, levels = x)) # keep points that were never drawn, with count 0

MT/nsim #gives the estimated probability for each point to be selected

###################
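A possible follow-up to the program above, as a sketch only: it attaches a rough Monte Carlo standard error to each estimate, pools the estimates over points with equal weights as suggested, and compares against the closed form for k = 2 given elsewhere in this thread. The variable names (`est`, `se`, `pooled`) and the seed are my own, and the binomial standard error treats the simulations as independent, which they are here.

```r
set.seed(42)
w <- c(1, 1, 2, 3, 3, 2) / 12          # same weights as above
k <- 2; nsim <- 10000

# Each column of `draws` is one simulated sample of size k
draws <- replicate(nsim, sample(seq_along(w), k, replace = FALSE, prob = w))

est <- tabulate(draws, nbins = length(w)) / nsim   # estimated inclusion probabilities
se <- sqrt(est * (1 - est) / nsim)                 # binomial standard error per point
pooled <- ave(est, w)                              # average over points with equal weight

# Closed form for k = 2, for comparison
exact <- w * (1 - w / (1 - w) + sum(w / (1 - w)))
round(cbind(exact, est, se, pooled), 4)
```

With nsim = 10000 the standard errors are at most about 0.005, so differences between points with equal weights in the raw estimates are mostly simulation noise.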

------------------------------

Giles Warrack

Retired

North Carolina A&T State University

Original Message:

Sent: 08-29-2024 10:26

From: Paul Auclair

Subject: A sampling question

Virginia, here's a reply from Perplexity.AI that describes the problem, outlines a solution, and provides some references.

https://www.perplexity.ai/search/a-colleague-is-interested-in-t-uckvx4wMSrKXKnhIyD3blw

------------------------------

Paul Auclair

Corporate Operations Research Analyst

LinQuest Corporation

Original Message:

Sent: 08-28-2024 22:12

From: Virginia Rovnyak

Subject: A sampling question

A colleague is interested in the following situation. There is a finite set of n points, each with a certain weight. A weighted random sample of k points is drawn without replacement. What is the probability that a given point x will be in this sample?

**Is this a known problem?** Any information or references would be appreciated.

------------------------------

Virginia Rovnyak

------------------------------